Preface to Pfeiffer Applied Probability



The course 

This is a "first course" in the sense that it presumes no previous course in probability. The units are 
modules taken from the unpublished text: Paul E. Pfeiffer, ELEMENTS OF APPLIED PROBABILITY, 
USING MATLAB. The units are numbered as they appear in the text, although of course they may be used 
in any desired order. For those who wish to use the order of the text, an outline is provided, with indication 
of which modules contain the material. 

The mathematical prerequisites are ordinary calculus and the elements of matrix algebra. A few standard 
series and integrals are used, and double integrals are evaluated as iterated integrals. The reader who can 
evaluate simple integrals can learn quickly from the examples how to deal with the iterated integrals used 
in the theory of expectation and conditional expectation. Appendix B (Section 17.2) provides a convenient 
compendium of mathematical facts used frequently in this work. And the symbolic toolbox, implementing 
MAPLE, may be used to evaluate integrals, if desired. 

In addition to an introduction to the essential features of basic probability in terms of a precise mathematical
model, the work describes and employs user-defined MATLAB procedures and functions (which we
refer to as m-programs, or simply programs) to solve many important problems in basic probability. This
should make the work useful as a stand-alone exposition as well as a supplement to any of several current
textbooks.

Most of the programs developed here were written in earlier versions of MATLAB, but have been revised 
slightly to make them quite compatible with MATLAB 7. In a few cases, alternate implementations are 
available in the Statistics Toolbox, but are implemented here directly from the basic MATLAB program, 
so that students need only that program (and the symbolic mathematics toolbox, if they desire its aid in 
evaluating integrals). 

Since machine methods require precise formulation of problems in appropriate mathematical form, it 
is necessary to provide some supplementary analytical material, principally the so-called minterm analysis. 
This material is not only important for computational purposes, but is also useful in displaying some of the 
structure of the relationships among events. 

A probability model 

Much of "real world" probabilistic thinking is an amalgam of intuitive, plausible reasoning and of statistical 
knowledge and insight. Mathematical probability attempts to lend precision to such probability analysis
by employing a suitable mathematical model, which embodies the central underlying principles and structure. 
A successful model serves as an aid (and sometimes corrective) to this type of thinking. 

Certain concepts and patterns have emerged from experience and intuition. The mathematical formulation
(the mathematical model) which has most successfully captured these essential ideas is rooted in
measure theory, and is known as the Kolmogorov model, after the brilliant Russian mathematician A.N.
Kolmogorov (1903-1987).

One cannot prove that a model is correct. Only experience may show whether it is useful (and not
incorrect). The usefulness of the Kolmogorov model is established by examining its structure and showing
that patterns of uncertainty and likelihood in any practical situation can be represented adequately.
Developments, such as in this course, have given ample evidence of such usefulness.

The most fruitful approach is characterized by an interplay of 

• A formulation of the problem in precise terms of the model and careful mathematical analysis of the 
problem so formulated. 

• A grasp of the problem based on experience and insight. This underlies both problem formulation 
and interpretation of analytical results of the model. Often such insight suggests approaches to the 
analytical solution process. 

MATLAB: A tool for learning 

In this work, we make extensive use of MATLAB as an aid to analysis. I have tried to write the MATLAB 
programs in such a way that they constitute useful, ready-made tools for problem solving. Once the user 
understands the problems they are designed to solve, the solution strategies used, and the manner in which 
these strategies are implemented, the collection of programs should provide a useful resource. 

However, my primary aim in exposition and illustration is to aid the learning process and to deepen 
insight into the structure of the problems considered and the strategies employed in their solution. Several 
features contribute to that end. 

1. Application of machine methods of solution requires precise formulation. The data available and the 
fundamental assumptions must be organized in an appropriate fashion. The requisite discipline for 
such formulation often contributes to enhanced understanding of the problem. 

2. The development of a MATLAB program for solution requires careful attention to possible solution 
strategies. One cannot instruct the machine without a clear grasp of what is to be done. 

3. I give attention to the tasks performed by a program, with a general description of how MATLAB 
carries out the tasks. The reader is not required to trace out all the programming details. However, 
it is often the case that available MATLAB resources suggest alternative solution strategies. Hence, 
for those so inclined, attention to the details may be fruitful. I have included, as a separate collection, 
the m-files written for this work. These may be used as patterns for extensions as well as programs in 
MATLAB for computations. Appendix A (Section 17.1) provides a directory of these m-files. 

4. Some of the details in the MATLAB script are presentation details. These are refinements which are 
not essential to the solution of the problem. But they make the programs more readily usable. And 
they provide illustrations of MATLAB techniques for those who may wish to write their own programs. 
I hope many will be inclined to go beyond this work, modifying current programs or writing new ones. 

An Invitation to Experiment and Explore 

Because the programs provide considerable freedom from the burden of computation and the tyranny of 
tables (with their limited ranges and parameter values), standard problems may be approached with a new 
spirit of experiment and discovery. When a program is selected (or written), it embodies one method of 
solution. There may be others which are readily implemented. The reader is invited, even urged, to explore! 
The user may experiment to whatever degree he or she finds useful and interesting. The possibilities are 
endless. 
Chapter 1

Probability Systems 



1.1 Likelihood

1.1.1 Introduction 

Probability models and techniques permeate many important areas of modern life. A variety of types of 
random processes, reliability models and techniques, and statistical considerations in experimental work play 
a significant role in engineering and the physical sciences. The solutions of management decision problems 
use as aids decision analysis, waiting line theory, inventory theory, time series, cost analysis under uncertainty 
— all rooted in applied probability theory. Methods of statistical analysis employ probability analysis as an 
underlying discipline. 

Modern probability developments are increasingly sophisticated mathematically. To utilize these, the 
practitioner needs a sound conceptual basis which, fortunately, can be attained at a moderate level of 
mathematical sophistication. There is need to develop a feel for the structure of the underlying mathematical 
model, for the role of various types of assumptions, and for the principal strategies of problem formulation 
and solution. 

Probability has roots that extend far back into antiquity. The notion of "chance" played a central role in 
the ubiquitous practice of gambling. But chance acts were often related to magic or religion. For example, 
there are numerous instances in the Hebrew Bible in which decisions were made "by lot" or some other 
chance mechanism, with the understanding that the outcome was determined by the will of God. In the 
New Testament, the book of Acts describes the selection of a successor to Judas Iscariot as one of "the 
Twelve." Two names, Joseph Barsabbas and Matthias, were put forward. The group prayed, then drew lots, 
which fell on Matthias. 

Early developments of probability as a mathematical discipline, freeing it from its religious and magical 
overtones, came as a response to questions about games of chance played repeatedly. The mathematical 
formulation owes much to the work of Pierre de Fermat and Blaise Pascal in the seventeenth century. The 
game is described in terms of a well defined trial (a play); the result of any trial is one of a specific set of 
distinguishable outcomes. Although the result of any play is not predictable, certain "statistical regularities" 
of results are observed. The possible results are described in ways that make each result seem equally likely. 
If there are N such possible "equally likely" results, each is assigned a probability 1/N.

The developers of mathematical probability also took cues from early work on the analysis of statistical 
data. The pioneering work of John Graunt in the seventeenth century was directed to the study of "vital 
statistics," such as records of births, deaths, and various diseases. Graunt determined the fractions of people 
in London who died from various diseases during a period in the early seventeenth century. Some thirty 
years later, in 1693, Edmond Halley (for whom the comet is named) published the first life insurance tables. 
To apply these results, one considers the selection of a member of the population on a chance basis. One 
then assigns the probability that such a person will have a given disease. The trial here is the selection of
a person, but the interest is in certain characteristics. We may speak of the event that the person selected
will die of a certain disease, say "consumption." Although it is a person who is selected, it is death from
consumption which is of interest. Out of this statistical formulation came an interest not only in probabilities
as fractions or relative frequencies but also in averages or expectations. These averages play an essential role
in modern probability.

We do not attempt to trace this history, which was long and halting, though marked by flashes of
brilliance. Certain concepts and patterns which emerged from experience and intuition called for clarification.
We move rather directly to the mathematical formulation (the "mathematical model") which has most
successfully captured these essential ideas. This model, rooted in the mathematical system known as
measure theory, is called the Kolmogorov model, after the brilliant Russian mathematician A.N. Kolmogorov
(1903-1987). Kolmogorov succeeded in bringing together various developments begun at the turn of the century,
principally in the work of E. Borel and H. Lebesgue on measure theory. Kolmogorov published his
epochal work in German in 1933. It was translated into English and published in 1956 by Chelsea Publishing
Company.

1.1.2 Outcomes and events 

Probability applies to situations in which there is a well defined trial whose possible outcomes are found 
among those in a given basic set. The following are typical. 

• A pair of dice is rolled; the outcome is viewed in terms of the numbers of spots appearing on the top
faces of the two dice. If the outcome is viewed as an ordered pair, there are thirty six equally likely
outcomes. If the outcome is characterized by the total number of spots on the two dice, then there are
eleven possible outcomes (not equally likely).

• A poll of a voting population is taken. Outcomes are characterized by responses to a question. For 
example, the responses may be categorized as positive (or favorable), negative (or unfavorable), or 
uncertain (or no opinion). 

• A measurement is made. The outcome is described by a number representing the magnitude of the 
quantity in appropriate units. In some cases, the possible values fall among a finite set of integers. In 
other cases, the possible values may be any real number (usually in some specified interval). 

• Much more sophisticated notions of outcomes are encountered in modern theory. For example, in 
communication or control theory, a communication system experiences only one signal stream in its 
life. But a communication system is not designed for a single signal stream. It is designed for one of 
an infinite set of possible signals. The likelihood of encountering a certain kind of signal is important 
in the design. Such signals constitute a subset of the larger set of all possible signals. 

These considerations show that our probability model must deal with 

• A trial which results in (selects) an outcome from a set of conceptually possible outcomes. The trial 
is not successfully completed until one of the outcomes is realized. 

• Associated with each outcome is a certain characteristic (or combination of characteristics) pertinent 
to the problem at hand. In polling for political opinions, it is a person who is selected. That person 
has many features and characteristics (race, age, gender, occupation, religious preference, preferences 
for food, etc.). But the primary feature, which characterizes the outcome, is the political opinion on 
the question asked. Of course, some of the other features may be of interest for analysis of the poll. 

Inherent in informal thought, as well as in precise analysis, is the notion of an event to which a probability 
may be assigned as a measure of the likelihood the event will occur on any trial. A successful mathematical 
model must formulate these notions with precision. An event is identified in terms of the characteristic of 
the outcome observed. The event "a favorable response" to a polling question occurs if the outcome observed 
has that characteristic; i.e., iff (if and only if) the respondent replies in the affirmative. A hand of five cards 
is drawn. The event "one or more aces" occurs iff the hand actually drawn has at least one ace. If that same
hand has two cards of the suit of clubs, then the event "two clubs" has occurred. These considerations lead
to the following definition. 

Definition. The event determined by some characteristic of the possible outcomes is the set of those 
outcomes having this characteristic. The event occurs iff the outcome of the trial is a member of that set 
(i.e., has the characteristic determining the event). 

• The event of throwing a "seven" with a pair of dice (which we call the event SEVEN) consists of the 
set of those possible outcomes with a total of seven spots turned up. The event SEVEN occurs iff the 
outcome is one of those combinations with a total of seven spots (i.e., belongs to the event SEVEN). 
This could be represented as follows. Suppose the two dice are distinguished (say by color) and a 
picture is taken of each of the thirty six possible combinations. On the back of each picture, write the 
number of spots. Now the event SEVEN consists of the set of all those pictures with seven on the 
back. Throwing the dice is equivalent to selecting randomly one of the thirty six pictures. The event 
SEVEN occurs iff the picture selected is one of the set of those pictures with seven on the back. 

• Observing for a very long (theoretically infinite) time the signal passing through a communication
channel is equivalent to selecting one of the conceptually possible signals. Now such signals have many
characteristics: the maximum peak value, the frequency spectrum, the degree of differentiability, the
average value over a given time period, etc. If the signal has a peak absolute value less than ten volts,
a frequency spectrum essentially limited from 60 hertz to 10,000 hertz, with peak rate of change 10,000
volts per second, then it is one of the set of signals with those characteristics. The event "the signal has
these characteristics" has occurred. This set (event) consists of an uncountable infinity of such signals.
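
The dice illustration above can be made concrete with a short sketch. This is written in Python rather than as one of the text's MATLAB m-programs, and the names `omega`, `seven`, and `occurs` are chosen here purely for illustration. It treats the basic set as the thirty six ordered pairs and the event SEVEN as a subset of it.

```python
from itertools import product

# Basic set: all ordered pairs (first die, second die).
omega = set(product(range(1, 7), repeat=2))

# The event SEVEN is the subset of outcomes whose spots total seven.
seven = {(a, b) for (a, b) in omega if a + b == 7}


def occurs(event, outcome):
    """An event occurs iff the observed outcome is a member of the set."""
    return outcome in event


# Characterizing outcomes by the total gives the possible totals 2 through 12.
totals = sorted({a + b for (a, b) in omega})

print(len(omega))               # number of ordered pairs
print(sorted(seven))            # the outcomes making up SEVEN
print(occurs(seven, (3, 4)))    # does SEVEN occur when (3, 4) is thrown?
print(len(totals))              # number of distinct totals
```

Representing an event as a set means that "the event occurs" reduces to a membership test, which is exactly the definition given above.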

One of the advantages of this formulation of an event as a subset of the basic set of possible outcomes is that 
we can use elementary set theory as an aid to formulation. And tools, such as Venn diagrams and indicator 
functions (Section 1.3) for studying event combinations, provide powerful aids to establishing and visualizing 
relationships between events. We formalize these ideas as follows: 

• Let $\Omega$ be the set of all possible outcomes of the basic trial or experiment. We call this the basic space
or the sure event, since if the trial is carried out successfully the outcome will be in $\Omega$; hence, the event
$\Omega$ is sure to occur on any trial. We must specify unambiguously what outcomes are "possible." In
flipping a coin, the only accepted outcomes are "heads" and "tails." Should the coin stand on its edge,
say by leaning against a wall, we would ordinarily consider that to be the result of an improper trial.

• As we note above, each outcome may have several characteristics which are the basis for describing 
events. Suppose we are drawing a single card from an ordinary deck of playing cards. Each card is 
characterized by a "face value" (two through ten, jack, queen, king, ace) and a "suit" (clubs, hearts, 
diamonds, spades). An ace is drawn (the event ACE occurs) iff the outcome (card) belongs to the 
set (event) of four cards with ace as face value. A heart is drawn iff the card belongs to the set of 
thirteen cards with heart as suit. Now it may be desirable to specify events which involve various 
logical combinations of the characteristics. Thus, we may be interested in the event the face value 
is jack or king and the suit is heart or spade. The set for jack or king is represented by the union
$J \cup K$ and the set for heart or spade is the union $H \cup S$. The occurrence of both conditions means the
outcome is in the intersection (common part) designated by $\cap$. Thus the event referred to is

$E = (J \cup K) \cap (H \cup S)$ (1.1)

The notation of set theory thus makes possible a precise formulation of the event E. 

• Sometimes we are interested in the situation in which the outcome does not have one of the characteristics.
Thus the set of cards which does not have suit heart is the set of all those outcomes not in
event $H$. In set theory, this is the complementary set (event) $H^c$.

• Events are mutually exclusive iff not more than one can occur on any trial. This is the condition that 
the sets representing the events are disjoint (i.e., have no members in common). 

• The notion of the impossible event is useful. The impossible event is, in set terminology, the empty
set $\emptyset$. Event $\emptyset$ cannot occur, since it has no members (contains no outcomes). One use of $\emptyset$ is to
provide a simple way of indicating that two sets are mutually exclusive. To say $AB = \emptyset$ (here we
use the alternate $AB$ for $A \cap B$) is to assert that events $A$ and $B$ have no outcome in common, hence
cannot both occur on any given trial.

• Set inclusion provides a convenient way to designate the fact that event $A$ implies event $B$, in the sense
that the occurrence of $A$ requires the occurrence of $B$. The set relation $A \subset B$ signifies that every
element (outcome) in $A$ is also in $B$. If a trial results in an outcome in $A$ (event $A$ occurs), then that
outcome is also in $B$ (so that event $B$ has occurred).
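
The card events in the list above translate directly into set operations. The following is a minimal Python sketch, not one of the text's m-programs; the helper names `face_event` and `suit_event` and the string labels for faces and suits are assumptions made for this illustration.

```python
from itertools import product

FACES = ["2", "3", "4", "5", "6", "7", "8", "9", "10",
         "jack", "queen", "king", "ace"]
SUITS = ["clubs", "hearts", "diamonds", "spades"]

# Basic set: the 52 cards, each an ordered pair (face, suit).
deck = set(product(FACES, SUITS))


def face_event(face):
    """Event: the card drawn has the given face value."""
    return {card for card in deck if card[0] == face}


def suit_event(suit):
    """Event: the card drawn has the given suit."""
    return {card for card in deck if card[1] == suit}


J, K, ACE = face_event("jack"), face_event("king"), face_event("ace")
H, S = suit_event("hearts"), suit_event("spades")

E = (J | K) & (H | S)   # (J union K) intersected with (H union S)
H_c = deck - H          # complement of H relative to the basic set

print(sorted(E))             # the four cards in E
print(len(H_c))              # cards that are not hearts
print(ACE & K == set())      # ACE and K are mutually exclusive
print(E <= (J | K))          # E implies "jack or king" (set inclusion)
```

Python's `|`, `&`, `-`, and `<=` on sets play the roles of union, intersection, complement (relative to the deck), and inclusion.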

The language and notation of sets provide a precise language and notation for events and their combinations.
We collect below some useful facts about logical (often called Boolean) combinations of events (as sets). The
notion of Boolean combinations may be applied to arbitrary classes of sets. For this reason, it is sometimes
useful to use an index set to designate membership. We say the index set $J$ is countable if it is finite or countably
infinite; otherwise it is uncountable. In the following it may be arbitrary.

$\{A_i : i \in J\}$ is the class of sets $A_i$, one for each index $i$ in the index set $J$ (1.2)

For example, if $J = \{1, 2, 3\}$ then $\{A_i : i \in J\}$ is the class $\{A_1, A_2, A_3\}$, and

$\bigcup_{i \in J} A_i = A_1 \cup A_2 \cup A_3, \quad \bigcap_{i \in J} A_i = A_1 \cap A_2 \cap A_3$ (1.3)

If $J = \{1, 2, \cdots\}$ then $\{A_i : i \in J\}$ is the sequence $\{A_i : 1 \le i\}$, and

$\bigcup_{i \in J} A_i = \bigcup_{i=1}^{\infty} A_i, \quad \bigcap_{i \in J} A_i = \bigcap_{i=1}^{\infty} A_i$ (1.4)

If event E is the union of a class of events, then event E occurs iff at least one event in the class occurs. If 
F is the intersection of a class of events, then event F occurs iff all events in the class occur on the trial. 

The role of disjoint unions is so important in probability that it is useful to have a symbol indicating 
the union of a disjoint class. We use the big V to indicate that the sets combined in the union are disjoint. 
Thus, for example, we write 

$A = \bigvee_{i=1}^{n} A_i$ to signify $A = \bigcup_{i=1}^{n} A_i$ with the proviso that the $A_i$ form a disjoint class (1.5)

Example 1.1: Events derived from a class 

Consider the class $\{E_1, E_2, E_3\}$ of events. Let $A_k$ be the event that exactly $k$ occur on a trial and
$B_k$ be the event that $k$ or more occur on a trial. Then

$A_0 = E_1^c E_2^c E_3^c, \quad A_1 = E_1 E_2^c E_3^c \bigvee E_1^c E_2 E_3^c \bigvee E_1^c E_2^c E_3,$
$A_2 = E_1 E_2 E_3^c \bigvee E_1 E_2^c E_3 \bigvee E_1^c E_2 E_3, \quad A_3 = E_1 E_2 E_3$ (1.6)

The unions are disjoint since each pair of terms has $E_i$ in one and $E_i^c$ in the other, for at least
one $i$. Now the $B_k$ can be expressed in terms of the $A_k$. For example

$B_2 = A_2 \bigvee A_3$ (1.7)

The union in this expression for $B_2$ is disjoint since we cannot have exactly two of the $E_i$ occur
and exactly three of them occur on the same trial. We may express $B_2$ directly in terms of the $E_i$
as follows:

$B_2 = E_1 E_2 \cup E_1 E_3 \cup E_2 E_3$ (1.8)

Here the union is not disjoint, in general. However, if one pair, say $\{E_1, E_3\}$ is disjoint, then
$E_1 E_3 = \emptyset$ and the pair $\{E_1 E_2, E_2 E_3\}$ is disjoint (draw a Venn diagram). Suppose $C$ is the event
the first two occur or the last two occur but no other combination. Then

$C = E_1 E_2 E_3^c \bigvee E_1^c E_2 E_3$ (1.9)

Let $D$ be the event that one or three of the events occur.

$D = A_1 \bigvee A_3 = E_1 E_2^c E_3^c \bigvee E_1^c E_2 E_3^c \bigvee E_1^c E_2^c E_3 \bigvee E_1 E_2 E_3$ (1.10)
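
The relations in Example 1.1 can be checked mechanically. The sketch below is in Python and is not taken from the text's m-files. As a simplifying assumption it takes the basic set to be the eight possible occurrence patterns for the three events, one outcome per pattern; the names `patterns`, `exactly`, and `at_least` are introduced only for this illustration.

```python
from itertools import product

# Assumption: each outcome is a pattern (e1, e2, e3), where 1 means the
# corresponding event occurs and 0 means it fails to occur.
patterns = set(product((0, 1), repeat=3))

# The three events, as subsets of the eight patterns.
E1 = {p for p in patterns if p[0] == 1}
E2 = {p for p in patterns if p[1] == 1}
E3 = {p for p in patterns if p[2] == 1}


def exactly(k):
    """A_k: exactly k of the three events occur."""
    return {p for p in patterns if sum(p) == k}


def at_least(k):
    """B_k: k or more of the three events occur."""
    return {p for p in patterns if sum(p) >= k}


A = [exactly(k) for k in range(4)]
B2 = at_least(2)

print(B2 == A[2] | A[3])                       # B_2 is the union of A_2 and A_3
print(A[2] & A[3] == set())                    # and that union is disjoint
print(B2 == (E1 & E2) | (E1 & E3) | (E2 & E3)) # direct form of B_2
D = A[1] | A[3]
print(sorted(D))                               # patterns with one or three occurrences
```

Each pattern corresponds to one of the eight products of the form $E_1 E_2^c E_3$ and so on, so an identity between Boolean combinations of the three events holds in general exactly when it holds on these patterns.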

Two important patterns in set theory known as DeMorgan's rules are useful in the handling of events. For 
an arbitrary class $\{A_i : i \in J\}$ of events,

$\left[\bigcup_{i \in J} A_i\right]^c = \bigcap_{i \in J} A_i^c$ and $\left[\bigcap_{i \in J} A_i\right]^c = \bigcup_{i \in J} A_i^c$ (1.11)

An outcome is not in the union (i.e., not in at least one) of the $A_i$ iff it fails to be in all $A_i$, and it is not in
the intersection (i.e. not in all) iff it fails to be in at least one of the $A_i$.

Example 1.2: Continuation of Example 1.1 (Events derived from a class) 

Express the event of no more than one occurrence of the events in $\{E_1, E_2, E_3\}$ as $B_2^c$.

$B_2^c = [E_1 E_2 \cup E_1 E_3 \cup E_2 E_3]^c = (E_1^c \cup E_2^c)(E_1^c \cup E_3^c)(E_2^c \cup E_3^c) = E_1^c E_2^c \cup E_1^c E_3^c \cup E_2^c E_3^c$ (1.12)

The last expression shows that not more than one of the $E_i$ occurs iff at least two of them fail to
occur.
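
DeMorgan's rules and the identity of Example 1.2 can be spot-checked on a small finite case. The following Python sketch uses an arbitrary twelve-element basic set and three arbitrary subsets, all chosen here only for illustration. A check of this kind illustrates the rules on one example; it does not prove them.

```python
# Assumption: a small, arbitrary basic set and three arbitrary events.
universe = set(range(12))
E1, E2, E3 = {0, 1, 2, 3}, {2, 3, 4, 5, 6}, {3, 6, 7, 8}
events = [E1, E2, E3]


def complement(s):
    """Complement relative to the basic set."""
    return universe - s


def union(sets):
    return set().union(*sets)


def intersection(sets):
    return set(universe).intersection(*sets)


# DeMorgan: complement of a union is the intersection of the complements,
# and complement of an intersection is the union of the complements.
print(complement(union(events)) == intersection([complement(e) for e in events]))
print(complement(intersection(events)) == union([complement(e) for e in events]))

# Example 1.2: "no more than one occurs" equals "at least two fail to occur".
B2 = (E1 & E2) | (E1 & E3) | (E2 & E3)
E1c, E2c, E3c = complement(E1), complement(E2), complement(E3)
print(complement(B2) == (E1c & E2c) | (E1c & E3c) | (E2c & E3c))
```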



1.2 Probability Systems

1.2.1 Probability measures

In the module "Likelihood" (Section 1.1) we introduce the notion of a basic space $\Omega$ of all possible outcomes
of a trial or experiment, events as subsets of the basic space determined by appropriate characteristics of 
the outcomes, and logical or Boolean combinations of the events (unions, intersections, and complements) 
corresponding to logical combinations of the defining characteristics. 

Occurrence or nonoccurrence of an event is determined by characteristics or attributes of the outcome 
observed on a trial. Performing the trial is visualized as selecting an outcome from the basic set. An 
event occurs whenever the selected outcome is a member of the subset representing the event. As described 
so far, the selection process could be quite deliberate, with a prescribed outcome, or it could involve the 
uncertainties associated with "chance." Probability enters the picture only in the latter situation. Before the 
trial is performed, there is uncertainty about which of these latent possibilities will be realized. Probability 
traditionally is a number assigned to an event indicating the likelihood of the occurrence of that event on 
any trial. 

We begin by looking at the classical model which first successfully formulated probability ideas in mathematical
form. We use modern terminology and notation to describe it.

Classical probability 

1. The basic space $\Omega$ consists of a finite number $N$ of possible outcomes.
- There are thirty six possible outcomes of throwing two dice.
- There are $C(52, 5) = \frac{52!}{5!\,47!} = 2598960$ different hands of five cards (order not important).
- There are $2^5 = 32$ results (sequences of heads or tails) of flipping five coins.

2. Each possible outcome is assigned a probability $1/N$.

3. If event (subset) $A$ has $N_A$ elements, then the probability assigned event $A$ is

$P(A) = N_A/N$ (i.e., the fraction favorable to $A$) (1.13)

With this definition of probability, each event $A$ is assigned a unique probability, which may be determined
by counting $N_A$, the number of elements in $A$ (in the classical language, the number of outcomes "favorable"
to the event) and $N$, the total number of possible outcomes in the sure event $\Omega$.
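
The classical assignment is simple enough to compute by direct enumeration when the basic space is small. The following Python sketch (not one of the text's MATLAB programs) does this for five coin flips; the function name `classical_probability` and the event "exactly two heads" are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product


def classical_probability(event, omega):
    """Classical probability: number of favorable outcomes over total outcomes.

    Assumes `event` is a subset of the finite basic space `omega`.
    """
    return Fraction(len(event), len(omega))


# Basic space: all sequences of heads (H) and tails (T) for five coins.
coins = set(product("HT", repeat=5))

# Illustrative event: exactly two of the five coins show heads.
two_heads = {seq for seq in coins if seq.count("H") == 2}

print(len(coins))                               # N
print(len(two_heads))                           # N_A
print(classical_probability(two_heads, coins))  # N_A / N as an exact fraction
```

Using `Fraction` keeps the result exact, which makes the counting interpretation visible in the answer.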

Example 1.3: Probabilities for hands of cards 

Consider the experiment of drawing a hand of five cards from an ordinary deck of 52 playing cards. 
The number of outcomes, as noted above, is N = C(52,5) = 2598960. What is the probability 
of drawing a hand with exactly two aces? What is the probability of drawing a hand with two or 
more aces? What is the probability of not more than one ace? 

SOLUTION 

Let $A$ be the event of exactly two aces, $B$ be the event of exactly three aces, and $C$ be the event
of exactly four aces. In the first problem, we must count the number $N_A$ of ways of drawing a hand
with two aces. We select two aces from the four, and select the other three cards from the 48
non-aces. Thus

$N_A = C(4, 2)\,C(48, 3) = 103776$, so that $P(A) = \frac{N_A}{N} = \frac{103776}{2598960} \approx 0.0399$ (1.14)

There are two or more aces iff there are exactly two or exactly three or exactly four. Thus the
event $D$ of two or more is $D = A \bigvee B \bigvee C$. Since $A$, $B$, $C$ are mutually exclusive,

$N_D = N_A + N_B + N_C = C(4, 2)\,C(48, 3) + C(4, 3)\,C(48, 2) + C(4, 4)\,C(48, 1) = 103776 + 4512 + 48 = 108336$ (1.15)

so that $P(D) \approx 0.0417$. There is one ace or none iff there are not two or more aces. We thus
want $P(D^c)$. Now the number in $D^c$ is the number not in $D$, which is $N - N_D$, so that

$P(D^c) = \frac{N - N_D}{N} = 1 - \frac{N_D}{N} = 1 - P(D) = 0.9583$ (1.16)

□
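
The counts in Example 1.3 can be reproduced with a few lines of code. This is a Python sketch using the standard library function `math.comb`, not the text's MATLAB procedure; the helper name `hands_with_aces` is an assumption made for this illustration.

```python
from math import comb

N = comb(52, 5)  # number of five-card hands, order not important


def hands_with_aces(k):
    """Number of five-card hands containing exactly k aces."""
    return comb(4, k) * comb(48, 5 - k)


N_A = hands_with_aces(2)
N_D = sum(hands_with_aces(k) for k in (2, 3, 4))  # disjoint cases add

P_A = N_A / N
P_D = N_D / N
P_Dc = 1 - P_D

print(N, N_A, N_D)
print(round(P_A, 4), round(P_D, 4), round(P_Dc, 4))

# Cross-check: count the hands with no ace or one ace directly.
print(hands_with_aces(0) + hands_with_aces(1) == N - N_D)
```

The final line counts the complement directly, which confirms that subtracting from $N$ gives the same number.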

This example illustrates several important properties of the classical probability. 

1. $P(A) = N_A/N$ is a nonnegative quantity.

2. $P(\Omega) = N/N = 1$

3. If $A$, $B$, $C$ are mutually exclusive, then the number in the disjoint union is the sum of the numbers in
the individual events, so that

$P(A \bigvee B \bigvee C) = P(A) + P(B) + P(C)$ (1.17)

Several other elementary properties of the classical probability may be identified. It turns out that they can 
be derived from these three. Although the classical model is highly useful, and an extensive theory has been 
developed, it is not really satisfactory for many applications (the communications problem, for example). 
We seek a more general model which includes classical probability as a special case and is thus an extension 
of it. We adopt the Kolmogorov model (introduced by the Russian mathematician A. N. Kolmogorov) which 
captures the essential ideas in a remarkably successful way. Of course, no model is ever completely successful.
Reality always seems to escape our logical nets. 

The Kolmogorov model is grounded in abstract measure theory. A full explication requires a level of 
mathematical sophistication inappropriate for a treatment such as this. But most of the concepts and many 
of the results are elementary and easily grasped. And many technical mathematical considerations are not 
important for applications at the level of this introductory treatment and may be disregarded. We borrow 
from measure theory a few key facts which are either very plausible or which can be understood at a practical 
level. This enables us to utilize a very powerful mathematical system for representing practical problems in 
a manner that leads to both insight and useful strategies of solution. 

Our approach is to begin with the notion of events as sets introduced above, then to introduce probability 
as a number assigned to events subject to certain conditions which become definitive properties. Gradually 
we introduce and utilize additional concepts to build progressively a powerful and useful discipline. The 
fundamental properties needed are just those illustrated in Example 1.3 (Probabilities for hands of cards) 
for the classical case. 

Definition 

A probability system consists of a basic set $\Omega$ of elementary outcomes of a trial or experiment, a class of
events as subsets of the basic space, and a probability measure $P(\cdot)$ which assigns values to the events in
accordance with the following rules:

(P1): For any event $A$, the probability $P(A) \ge 0$.
(P2): The probability of the sure event $P(\Omega) = 1$.

(P3): Countable additivity. If $\{A_i : i \in J\}$ is a mutually exclusive, countable class of events, then the
probability of the disjoint union is the sum of the individual probabilities.

The necessity of the mutual exclusiveness (disjointedness) is illustrated in Example 1.3 (Probabilities for 
hands of cards). If the sets were not disjoint, probability would be counted more than once in the sum. A 
probability, as defined, is abstract — simply a number assigned to each set representing an event. But we can 
give it an interpretation which helps to visualize the various patterns and relationships encountered. We may 
think of probability as mass assigned to an event. The total unit mass is assigned to the basic set $\Omega$. The
additivity property for disjoint sets makes the mass interpretation consistent. We can use this interpretation 
as a precise representation. Repeatedly we refer to the probability mass assigned a given set. The mass 
is proportional to the weight, so sometimes we speak informally of the weight rather than the mass. Now 
a mass assignment with three properties does not seem a very promising beginning. But we soon expand 
this rudimentary list of properties. We use the mass interpretation to help visualize the properties, but are 
primarily concerned to interpret them in terms of likelihoods. 

(P4): $P(A^c) = 1 - P(A)$. This follows from additivity and the fact that

$1 = P(\Omega) = P(A \bigvee A^c) = P(A) + P(A^c)$ (1.18)

(P5): $P(\emptyset) = 0$. The empty set represents an impossible event. It has no members, hence cannot occur.
It seems reasonable that it should be assigned zero probability (mass). Since $\emptyset = \Omega^c$, this follows
logically from (P4) and (P2).
Figure 1.1: Partitions of the union $A \cup B$.



(P6): If $A \subset B$, then $P(A) \le P(B)$. From the mass point of view, every point in $A$ is also in $B$, so that $B$
must have at least as much mass as $A$. Now the relationship $A \subset B$ means that if $A$ occurs, $B$ must
also. Hence $B$ is at least as likely to occur as $A$. From a purely formal point of view, we have

$B = A \bigvee A^c B$ so that $P(B) = P(A) + P(A^c B) \ge P(A)$ since $P(A^c B) \ge 0$ (1.19)



(P7): $P(A \cup B) = P(A) + P(A^c B) = P(B) + P(AB^c) = P(AB^c) + P(AB) + P(A^c B) = P(A) + P(B) - P(AB)$.
The first three expressions follow from additivity and partitioning of $A \cup B$ as follows (see Figure 1.1)

$A \cup B = A \bigvee A^c B = B \bigvee AB^c = AB^c \bigvee AB \bigvee A^c B$ (1.20)



If we add the first two expressions and subtract the third, we get the last expression. In terms of 
probability mass, the first expression says the probability in A U B is the probability mass in A plus 
the additional probability mass in the part of B which is not in A. A similar interpretation holds for 
the second expression. The third is the probability in the common part plus the extra in A and the 
extra in B. If we add the mass in A and B we have counted the mass in the common part twice. The 
last expression shows that we correct this by taking away the extra common mass. 
(P8): If $\{B_i : i \in J\}$ is a countable, disjoint class and $A$ is contained in the union, then

$$A = \bigvee_{i \in J} AB_i \quad \text{so that} \quad P(A) = \sum_{i \in J} P(AB_i) \tag{1.21}$$

(P9): Subadditivity. If $A = \bigcup_{i=1}^{\infty} A_i$, then $P(A) \le \sum_{i=1}^{\infty} P(A_i)$. This follows from countable additivity,
property (P6), and the fact (Partitions)

$$A = \bigcup_{i=1}^{\infty} A_i = \bigvee_{i=1}^{\infty} B_i, \quad \text{where } B_i = A_i A_1^c A_2^c \cdots A_{i-1}^c \subset A_i \tag{1.22}$$

This includes as a special case the union of a finite number of events.

Some of these properties, such as (P4), (P5), and (P6), are
so elementary that it seems they should be included in the defining statement. This would not be incorrect,
but would be inefficient. If we have an assignment of numbers to the events, we need only establish (P1),
(P2), and (P3) to be able to assert that the assignment
constitutes a probability measure. And the other properties follow as logical consequences.
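
The derived properties can be checked numerically on a small finite space. The following is a minimal sketch in Python (not one of the text's m-programs); the fair-die mass assignment, the events `A` and `B`, and the helper name `prob` are assumptions chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair die, so each outcome carries mass 1/6.
omega = frozenset(range(1, 7))
mass = {w: Fraction(1, 6) for w in omega}

def prob(event):
    """Probability of an event: the total mass of the outcomes it contains."""
    return sum((mass[w] for w in event), Fraction(0))

A = frozenset({1, 2, 3})   # "at most three spots"
B = frozenset({2, 4, 6})   # "an even number of spots"

p4_holds = prob(omega - A) == 1 - prob(A)                      # (P4) complement rule
p5_holds = prob(frozenset()) == 0                              # (P5) empty set
p6_holds = prob(A & B) <= prob(A)                              # (P6) AB is a subset of A
p7_holds = prob(A | B) == prob(A) + prob(B) - prob(A & B)      # (P7) union rule
p9_holds = prob(A | B) <= prob(A) + prob(B)                    # (P9) subadditivity, two events

print(p4_holds, p5_holds, p6_holds, p7_holds, p9_holds)
```

A check of this kind illustrates the properties for one assignment; it does not replace the derivations above, which hold for every probability measure.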
Flexibility at a price 

In moving beyond the classical model, we have gained great flexibility and adaptability of the model. 
It may be used for systems in which the number of outcomes is infinite (countably or uncountably). It 
does not require a uniform distribution of the probability mass among the outcomes. For example, the 
dice problem may be handled directly by assigning the appropriate probabilities to the various numbers of 
total spots, 2 through 12. As we see in the treatment of conditional probability (Section 3.1), we make 
new probability assignments (i.e., introduce new probability measures) when partial information about the 
outcome is obtained. 

But this freedom is obtained at a price. In the classical case, the probability value to be assigned an event 
is clearly defined (although it may be very difficult to perform the required counting). In the general case, 
we must resort to experience, structure of the system studied, experiment, or statistical studies to assign 
probabilities. 

The existence of uncertainty due to "chance" or "randomness" does not necessarily imply that the act of 
performing the trial is haphazard. The trial may be quite carefully planned; the contingency may be the result 
of factors beyond the control or knowledge of the experimenter. The mechanism of chance (i.e., the source 
of the uncertainty) may depend upon the nature of the actual process or system observed. For example, in 
taking an hourly temperature profile on a given day at a weather station, the principal variations are not due 
to experimental error but rather to unknown factors which converge to provide the specific weather pattern 
experienced. In the case of an uncorrected digital transmission error, the cause of uncertainty lies in the 
intricacies of the correction mechanisms and the perturbations produced by a very complex environment. A 
patient at a clinic may be self selected. Before his or her appearance and the result of a test, the physician 
may not know which patient with which condition will appear. In each case, from the point of view of the 
experimenter, the cause is simply attributed to "chance." Whether one sees this as an "act of the gods" or 
simply the result of a configuration of physical or behavioral causes too complex to analyze, the situation is 
one of uncertainty, before the trial, about which outcome will present itself. 

If there were complete uncertainty, the situation would be chaotic. But this is not usually the case. 
While there is an extremely large number of possible hourly temperature profiles, a substantial subset of 
these has very little likelihood of occurring. For example, profiles in which successive hourly temperatures 
alternate between very high then very low values throughout the day constitute an unlikely subset (event). 
One normally expects trends in temperatures over the 24 hour period. Although a traffic engineer does not 
know exactly how many vehicles will be observed in a given time period, experience provides some idea what 
range of values to expect. While there is uncertainty about which patient, with which symptoms, will appear 
at a clinic, a physician certainly knows approximately what fraction of the clinic's patients have the disease 
in question. In a game of chance, analyzed into "equally likely" outcomes, the assumption of equal likelihood 
is based on knowledge of symmetries and structural regularities in the mechanism by which the game is 
carried out. And the number of outcomes associated with a given event is known, or may be determined. 

In each case, there is some basis in statistical data on past experience or knowledge of structure, regularity,
and symmetry in the system under observation which makes it possible to assign likelihoods to the occurrence
of various events. It is this ability to assign likelihoods to the various events which characterizes applied
probability. However determined, probability is a number assigned to events to indicate their likelihood of
occurrence. The assignments must be consistent with the defining properties (P1), (P2),
(P3) along with derived properties (P4) through (P9) (plus others
which may also be derived from these). Since the probabilities are not "built in," as in the classical case, a
prime role of probability theory is to derive other probabilities from a set of given probabilities.

1.3 Interpretations 3 
1.3.1 What is Probability? 

The formal probability system is a model whose usefulness can only be established by examining its structure 
and determining whether patterns of uncertainty and likelihood in any practical situation can be represented 
adequately. With the exception of the sure event and the impossible event, the model does not tell us how to 
assign probability to any given event. The formal system is consistent with many probability assignments, 
just as the notion of mass is consistent with many different mass assignments to sets in the basic space. 

The defining properties (P1), (P2), (P3) and derived
properties provide consistency rules for making probability assignments. One cannot assign negative
probabilities or probabilities greater than one. The sure event is assigned probability one. If two or more events
are mutually exclusive, the total probability assigned to the union must equal the sum of the probabilities 
of the separate events. Any assignment of probability consistent with these conditions is allowed. 

One may not know the probability assignment to every event. Just as the defining conditions put 
constraints on allowable probability assignments, they also provide important structure. A typical applied 
problem provides the probabilities of members of a class of events (perhaps only a few) from which to 
determine the probabilities of other events of interest. We consider an important class of such problems in 
the next chapter. 

There is a variety of points of view as to how probability should be interpreted. These impact the manner
in which probabilities are assigned (or assumed). There is one important dichotomy among practitioners.

• One group believes probability is objective in the sense that it is something inherent in the nature of 
things. It is to be discovered, if possible, by analysis and experiment. Whether we can determine it or 
not, "it is there." 

• Another group insists that probability is a condition of the mind of the person making the probability 
assessment. From this point of view, the laws of probability simply impose rational consistency upon 
the way one assigns probabilities to events. Various attempts have been made to find objective ways 
to measure the strength of one's belief or degree of certainty that an event will occur. The probability 
$P(A)$ expresses the degree of certainty one feels that event $A$ will occur. One approach to characterizing
an individual's degree of certainty is to equate his assessment of $P(A)$ with the amount $a$ he is willing
to pay to play a game which returns one unit of money if $A$ occurs, for a gain of $(1 - a)$, and returns
zero if $A$ does not occur, for a gain of $-a$. Behind this formulation is the notion of a fair game, in
which the "expected" or "average" gain is zero.
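
The fair-game condition in the second view can be written out explicitly. As a sketch, assume the "expected" gain means the probability-weighted average of the two possible gains (a notion the text develops formally only later). For a price $a$, this average is

```latex
\[
(1 - a)\,P(A) + (-a)\,\bigl(1 - P(A)\bigr) = P(A) - a ,
\]
```

which is zero exactly when $a = P(A)$. On this reading, the price an individual regards as fair is that individual's assessment of the probability.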

The early work on probability began with a study of relative frequencies of occurrence of an event under 
repeated but independent trials. This idea is so imbedded in much intuitive thought about probability that 
some probabilists have insisted that it must be built into the definition of probability. This approach has not 
been entirely successful mathematically and has not attracted much of a following among either theoretical or 
applied probabilists. In the model we adopt, there is a fundamental limit theorem, known as Borel's theorem,
which may be interpreted "if a trial is performed a large number of times in an independent manner, the
fraction of times that event $A$ occurs approaches as a limit the value $P(A)$." Establishing this result (which
we do not do in this treatment) provides a formal validation of the intuitive notion that lay behind the
early attempts to formulate probabilities. Inveterate gamblers had noted long-run statistical regularities,
and sought explanations from their mathematically gifted friends. From this point of view, probability is
meaningful only in repeatable situations. Those who hold this view usually assume an objective view of
probability. It is a number determined by the nature of reality, to be discovered by repeated experiment.

3 This content is available online at <http://cnx.org/content/m23246/1.7/>.
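
The relative frequency idea behind Borel's theorem can be illustrated by simulation. The sketch below is in Python (not one of the text's m-programs) and uses the standard pseudorandom generator as a stand-in for independent trials; the value `p = 0.3`, the seed, and the function name `relative_frequency` are assumptions made for illustration only.

```python
import random

def relative_frequency(p, n_trials, seed=0):
    """Fraction of n_trials simulated independent trials on which an
    event of probability p occurs."""
    rng = random.Random(seed)   # fixed seed so that a run can be repeated
    hits = sum(1 for _ in range(n_trials) if rng.random() < p)
    return hits / n_trials

p = 0.3  # hypothetical value of P(A)
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(p, n))
```

The printed fractions typically settle near `p` as the number of trials grows. A simulation can only suggest the limit; it does not prove it.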

There are many applications of probability in which the relative frequency point of view is not feasible. 
Examples include predictions of the weather, the outcome of a game or a horse race, the performance of an 
individual on a particular job, the success of a newly designed computer. These are unique, nonrepeatable 
trials. As the popular expression has it, "You only go around once." Sometimes, probabilities in these 
situations may be quite subjective. As a matter of fact, those who take a subjective view tend to think 
in terms of such problems, whereas those who take an objective view usually emphasize the frequency 
interpretation. 

Example 1.4: Subjective probability and a football game 

The probability that one's favorite football team will win the next Superbowl Game may well 
be only a subjective probability of the bettor. This is certainly not a probability that can be 
determined by a large number of repeated trials. The game is only played once. However, the 
subjective assessment of probabilities may be based on intimate knowledge of relative strengths 
and weaknesses of the teams involved, as well as factors such as weather, injuries, and experience. 
There may be a considerable objective basis for the subjective assignment of probability. In fact, 
there is often a hidden "frequentist" element in the subjective evaluation. There is an assessment 
(perhaps unrealized) that in similar situations the frequencies tend to coincide with the value 
subjectively assigned. 

Example 1.5: The probability of rain 

Newscasts often report that the probability of rain is 20 percent or 60 percent or some other
figure. There are several difficulties here. 

• To use the formal mathematical model, there must be precision in determining an event. 
An event either occurs or it does not. How do we determine whether it has rained or not? 
Must there be a measurable amount? Where must this rain fall to be counted? During what 
time period? Even if there is agreement on the area, the amount, and the time period, there 
remains ambiguity: one cannot say with logical certainty the event did occur or it did not 
occur. Nevertheless, in this and other similar situations, use of the concept of an event may be 
helpful even if the description is not definitive. There is usually enough practical agreement 
for the concept to be useful. 

• What does a 30 percent probability of rain mean? Does it mean that if the prediction is correct, 
30 percent of the area indicated will get rain (in an agreed amount) during the specified time 
period? Or does it mean that 30 percent of the occasions on which such a prediction is made 
there will be significant rainfall in the area during the specified time period? Again, the latter 
alternative may well hide a frequency interpretation. Does the statement mean that it rains 
30 percent of the times when conditions are similar to current conditions? 

Regardless of the interpretation, there is some ambiguity about the event and whether it has 
occurred. And there is some difficulty with knowing how to interpret the probability figure. While 
the precise meaning of a 30 percent probability of rain may be difficult to determine, it is generally 
useful to know whether the conditions lead to a 20 percent or a 30 percent or a 40 percent probability 
assignment. And there is no doubt that as weather forecasting technology and methodology continue
to improve, the weather probability assessments will become increasingly useful.

Another common type of probability situation involves determining the distribution of some characteristic
over a population, usually by a survey. These data are used to answer the question: What is the probability
(likelihood) that a member of the population, chosen "at random" (i.e., on an equally likely basis) will have
a certain characteristic?

Example 1.6: Empirical probability based on survey data

A survey asks two questions of 300 students: Do you live on campus? Are you satisfied with
the recreational facilities in the student center? Answers to the latter question were categorized
"reasonably satisfied," "unsatisfied," or "no definite opinion." Let $C$ be the event "on campus;" $O$
be the event "off campus;" $S$ be the event "reasonably satisfied;" $U$ be the event "unsatisfied;" and
$N$ be the event "no definite opinion." Data are shown in the following table.

Survey Data

         S      U      N
C      127     31     42
O       46     43     11

Table 1.1

If an individual is selected on an equally likely basis from this group of 300, the probability of any 
of the events is taken to be the relative frequency of respondents in each category corresponding 
to an event. There are 200 on campus members in the population, so P (C) = 200/300 and 
P (O) = 100/300. The probability that a student selected is on campus and satisfied is taken to be 
P (CS) = 127/300. The probability a student is either on campus and satisfied or off campus and 
not satisfied is 

$$P(CS \bigvee OU) = P(CS) + P(OU) = 127/300 + 43/300 = 170/300 \tag{1.23}$$

If there is reason to believe that the population sampled is representative of the entire student 
body, then the same probabilities would be applied to any student selected at random from the 
entire student body. 
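
The calculations in Example 1.6 can be reproduced directly from the counts in Table 1.1. This is a minimal sketch in Python (not one of the text's m-programs); the dictionary layout and the helper name `prob` are assumptions made for illustration.

```python
from fractions import Fraction

# Counts from Table 1.1: first key is residence (C, O), second is opinion (S, U, N).
counts = {
    ("C", "S"): 127, ("C", "U"): 31, ("C", "N"): 42,
    ("O", "S"): 46,  ("O", "U"): 43, ("O", "N"): 11,
}
total = sum(counts.values())  # number of students surveyed

def prob(cells):
    """Relative frequency of the disjoint union of the listed table cells."""
    return Fraction(sum(counts[c] for c in cells), total)

p_C = prob([c for c in counts if c[0] == "C"])     # on campus
p_O = prob([c for c in counts if c[0] == "O"])     # off campus
p_CS = prob([("C", "S")])                          # on campus and satisfied
p_CS_or_OU = prob([("C", "S"), ("O", "U")])        # disjoint union, as in (1.23)

print(total, p_C, p_O, p_CS, p_CS_or_OU)
```

Because the table cells are mutually exclusive, adding counts before dividing is the same as adding the cell probabilities, which is the additivity used in (1.23). `Fraction` reduces each ratio to lowest terms, so 200/300 prints as 2/3.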

It is fortunate that we do not have to declare a single position to be the "correct" viewpoint and interpretation. 
The formal model is consistent with any of the views set forth. We are free in any situation to make the 
interpretation most meaningful and natural to the problem at hand. It is not necessary to fit all problems 
into one conceptual mold; nor is it necessary to change the mathematical model each time a different point of
view seems appropriate. 



1.3.2 Probability and odds

Often we find it convenient to work with a ratio of probabilities. If $A$ and $B$ are events with positive
probability, the odds favoring $A$ over $B$ is the probability ratio $P(A)/P(B)$. If not otherwise specified, $B$ is
taken to be $A^c$ and we speak of the odds favoring $A$

$$O(A) = \frac{P(A)}{P(A^c)} = \frac{P(A)}{1 - P(A)} \tag{1.24}$$

This expression may be solved algebraically to determine the probability from the odds

$$P(A) = \frac{O(A)}{1 + O(A)} \tag{1.25}$$

In particular, if $O(A) = a/b$ then $P(A) = \frac{a/b}{1 + a/b} = \frac{a}{a + b}$.

For example, if $P(A) = 0.7$, then $O(A) = 0.7/0.3 = 7/3$. If the odds favoring $A$ is 5/3, then $P(A) = 5/(5 + 3) = 5/8$.
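
The two conversions between probability and odds are simple enough to state as code. The following is a minimal sketch in Python (not one of the text's m-programs); the function names `odds_from_prob` and `prob_from_odds` are hypothetical, and exact rational arithmetic is used so that the results can be compared with the fractions in the text.

```python
from fractions import Fraction

def odds_from_prob(p):
    """Odds favoring A, O(A) = P(A) / (1 - P(A)); requires 0 <= P(A) < 1."""
    p = Fraction(p)
    if not 0 <= p < 1:
        raise ValueError("probability must satisfy 0 <= p < 1")
    return p / (1 - p)

def prob_from_odds(odds):
    """Probability recovered from the odds, P(A) = O(A) / (1 + O(A))."""
    odds = Fraction(odds)
    if odds < 0:
        raise ValueError("odds must be nonnegative")
    return odds / (1 + odds)

print(odds_from_prob("0.7"))            # the text's first case, P(A) = 0.7
print(prob_from_odds(Fraction(5, 3)))   # the text's second case, odds 5/3
```

The case $P(A) = 1$ is excluded because the odds favoring the sure event are not a finite ratio.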

1.3.3 Partitions and Boolean combinations of events 

The countable additivity property (P3) places a premium on appropriate partitioning of
events.

Definition. A partition is a mutually exclusive class

$$\{A_i : i \in J\} \quad \text{such that} \quad \Omega = \bigvee_{i \in J} A_i \tag{1.26}$$

A partition of event $A$ is a mutually exclusive class

$$\{A_i : i \in J\} \quad \text{such that} \quad A = \bigvee_{i \in J} A_i \tag{1.27}$$

Remarks.

• A partition is a mutually exclusive class of events such that one (and only one) must occur on each
trial.

• A partition of event $A$ is a mutually exclusive class of events such that $A$ occurs iff one (and only one)
of the $A_i$ occurs.

• A partition (no qualifier) is taken to be a partition of the sure event $\Omega$.

• If class $\{B_i : i \in J\}$ is mutually exclusive and $A \subset B = \bigvee_{i \in J} B_i$, then the class $\{AB_i : i \in J\}$ is a
partition of $A$ and $A = \bigvee_{i \in J} AB_i$.

We may begin with a sequence $\{A_i : 1 \le i\}$ and determine a mutually exclusive (disjoint) sequence
$\{B_i : 1 \le i\}$ as follows:

$$B_1 = A_1, \quad \text{and for any } i > 1, \quad B_i = A_i A_1^c A_2^c \cdots A_{i-1}^c \tag{1.28}$$

Thus each $B_i$ is the set of those elements of $A_i$ not in any of the previous members of the sequence.

This representation is used to show that subadditivity (P9) follows from countable
additivity and property (P6). Since each $B_i \subset A_i$, by (P6) $P(B_i) \le P(A_i)$.
Now

$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = P\left(\bigvee_{i=1}^{\infty} B_i\right) = \sum_{i=1}^{\infty} P(B_i) \le \sum_{i=1}^{\infty} P(A_i) \tag{1.29}$$

The representation of a union as a disjoint union points to an important strategy in the solution of probability 
problems. If an event can be expressed as a countable disjoint union of events, each of whose probabilities is 
known, then the probability of the combination is the sum of the individual probabilities. In the module on
Partitions and Minterms (Section 2.1.2: Partitions and minterms), we show that any Boolean combination 
of a finite class of events can be expressed as a disjoint union in a manner that often facilitates systematic 
determination of the probabilities. 
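
The construction in (1.28) is easy to carry out for a finite list of finite sets. The sketch below is in Python (not one of the text's m-programs); the function name `disjointify`, the sample sets, and the uniform mass on five points are assumptions made for illustration.

```python
from fractions import Fraction

def disjointify(sets):
    """Return B_1, B_2, ... where B_i holds the elements of A_i that are
    in none of the earlier sets A_1, ..., A_{i-1}, as in (1.28)."""
    seen = set()      # union of the sets processed so far
    result = []
    for a in sets:
        result.append(set(a) - seen)
        seen |= set(a)
    return result

A = [{1, 2, 3}, {2, 3, 4}, {1, 5}, {4, 5}]
B = disjointify(A)
print(B)

# Hypothetical uniform mass on the five points, to illustrate (1.29).
def prob(event):
    return Fraction(len(event), 5)

p_union = prob(set().union(*A))
sum_B = sum(prob(b) for b in B)
sum_A = sum(prob(a) for a in A)
print(p_union, sum_B, sum_A)
```

The sets in `B` are pairwise disjoint and have the same union as the sets in `A`, so the sum of their probabilities equals the probability of the union, while the sum over the original overlapping sets can only be larger.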

1.3.4 The indicator function 

One of the most useful tools for dealing with set combinations (and hence with event combinations) is the
indicator function $I_E$ for a set $E \subset \Omega$. It is defined very simply as follows:

$$I_E(\omega) = \begin{cases} 1 & \text{for } \omega \in E \\ 0 & \text{for } \omega \in E^c \end{cases} \tag{1.30}$$

Remark. Indicator functions may be defined on any domain. We have occasion in various cases to define
them on the real line and on higher dimensional Euclidean spaces. For example, if $M$ is the interval $[a, b]$
on the real line then $I_M(t) = 1$ for each $t$ in the interval (and is zero otherwise). Thus we have a step
function with unit value over the interval $M$. In the abstract basic space $\Omega$ we cannot draw a graph so easily.
However, with the representation of sets on a Venn diagram, we can give a schematic representation, as in
Figure 1.2.

Figure 1.2: Representation of the indicator function $I_E$ for event $E$.



Much of the usefulness of the indicator function comes from the following properties. 

(IF1): $I_A \le I_B$ iff $A \subset B$. If $I_A \le I_B$, then $\omega \in A$ implies $I_A(\omega) = I_B(\omega) = 1$, so $\omega \in B$. If $A \subset B$, then
$I_A(\omega) = 1$ implies $\omega \in A$ implies $\omega \in B$ implies $I_B(\omega) = 1$.

(IF2): $I_A = I_B$ iff $A = B$

$$A = B \text{ iff both } A \subset B \text{ and } B \subset A \text{ iff } I_A \le I_B \text{ and } I_B \le I_A \text{ iff } I_A = I_B \tag{1.31}$$

(IF3): $I_{A^c} = 1 - I_A$. This follows from the fact $I_{A^c}(\omega) = 1$ iff $I_A(\omega) = 0$.

(IF4): $I_{AB} = I_A I_B = \min\{I_A, I_B\}$ (extends to any class). An element $\omega$ belongs to the intersection iff it
belongs to all iff the indicator function for each event is one iff the product of the indicator functions
is one.

(IF5): $I_{A \cup B} = I_A + I_B - I_A I_B = \max\{I_A, I_B\}$ (the maximum rule extends to any class). The maximum
rule follows from the fact that $\omega$ is in the union iff it is in any one or more of the events in the union iff
any one or more of the individual indicator functions has value one iff the maximum is one. The sum
rule for two events is established by DeMorgan's rule and properties (IF2), (IF3), and (IF4).

$$I_{A \cup B} = 1 - I_{A^c B^c} = 1 - [1 - I_A][1 - I_B] = 1 - 1 + I_B + I_A - I_A I_B \tag{1.32}$$

(IF6): If the pair $\{A, B\}$ is disjoint, $I_{A \bigvee B} = I_A + I_B$ (extends to any disjoint class)
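
On a finite basic space an indicator function can be stored as a list of zeros and ones, which makes properties (IF3) through (IF6) easy to check pointwise. This is a minimal sketch in Python (not one of the text's m-programs); the ten-point space, the sets `A`, `B`, `C`, and the helper name `indicator` are assumptions made for illustration.

```python
omega = list(range(10))   # hypothetical finite basic space

def indicator(E):
    """Indicator function of E, tabulated over omega as a list of 0s and 1s."""
    return [1 if w in E else 0 for w in omega]

A = {0, 1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {7, 8}                # disjoint from A

I_A, I_B, I_C = indicator(A), indicator(B), indicator(C)

# (IF3) complement
if3 = indicator(set(omega) - A) == [1 - a for a in I_A]
# (IF4) intersection: product rule and minimum rule
if4 = (indicator(A & B)
       == [a * b for a, b in zip(I_A, I_B)]
       == [min(a, b) for a, b in zip(I_A, I_B)])
# (IF5) union: sum rule and maximum rule
if5 = (indicator(A | B)
       == [a + b - a * b for a, b in zip(I_A, I_B)]
       == [max(a, b) for a, b in zip(I_A, I_B)])
# (IF6) disjoint union: indicator functions simply add
if6 = indicator(A | C) == [a + c for a, c in zip(I_A, I_C)]

print(if3, if4, if5, if6)
```

Representing events by 0/1 arrays in this way turns set operations into ordinary arithmetic, which is the practical appeal of the indicator function.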

The following example illustrates the use of indicator functions in establishing relationships between set
combinations. Other uses and techniques are established in the module on Partitions and Minterms
(Section 2.1.2: Partitions and minterms).

Example 1.7: Indicator functions and set combinations

Suppose $\{A_i : 1 \le i \le n\}$ is a partition.

$$\text{If } B = \bigvee_{i=1}^{n} A_i C_i, \quad \text{then } B^c = \bigvee_{i=1}^{n} A_i C_i^c \tag{1.33}$$

VERIFICATION

Utilizing properties of the indicator function established above, we have

$$I_B = \sum_{i=1}^{n} I_{A_i} I_{C_i} \tag{1.34}$$

Note that since the $A_i$ form a partition, we have $\sum_{i=1}^{n} I_{A_i} = 1$, so that the indicator function for
the complementary event is

$$I_{B^c} = 1 - \sum_{i=1}^{n} I_{A_i} I_{C_i} = \sum_{i=1}^{n} I_{A_i} - \sum_{i=1}^{n} I_{A_i} I_{C_i} = \sum_{i=1}^{n} I_{A_i} [1 - I_{C_i}] = \sum_{i=1}^{n} I_{A_i} I_{C_i^c} \tag{1.35}$$

The last sum is the indicator function for $\bigvee_{i=1}^{n} A_i C_i^c$.

1.3.5 A technical comment on the class of events 

The class of events plays a central role in the intuitive background, the application, and the formal math- 
ematical structure. Events have been modeled as subsets of the basic space of all possible outcomes of the 
trial or experiment. In the case of a finite number of outcomes, any subset can be taken as an event. In the 
general theory, involving infinite possibilities, there are some technical mathematical reasons for limiting the 
class of subsets to be considered as events. The practical needs are these: 

1. If A is an event, its complementary set must also be an event. 

2. If $\{A_i : i \in J\}$ is a finite or countable class of events, the union and the intersection of members of the
class need to be events.

A simple argument based on DeMorgan's rules shows that if the class contains complements of all its sets 
and countable unions, then it contains countable intersections. Likewise, if it contains complements of all its 
sets and countable intersections, then it contains countable unions. A class of sets closed under complements 
and countable unions is known as a sigma algebra of sets. In a formal, measure-theoretic treatment, a basic 
assumption is that the class of events is a sigma algebra and the probability measure assigns probabilities to 
members of that class. Such a class is so general that it takes very sophisticated arguments to establish the 
fact that such a class does not contain all subsets. But precisely because the class is so general and inclusive,
in ordinary applications we need not be concerned about which sets are permissible as events.

A primary task in formulating a probability problem is identifying the appropriate events and the rela- 
tionships between them. The theoretical treatment shows that we may work with great freedom in forming 
events, with the assurance that in most applications a set so produced is a mathematically valid event.
The so called measurability question only comes into play in dealing with random processes with continuous 
parameters. Even there, under reasonable assumptions, the sets produced will be events. 

1.4 Problems on Probability Systems 4

4 This content is available online at <http://cnx.org/content/m24071/1.4/>.

Exercise 1.1 (Solution on p. 23.)

Let $\Omega$ consist of the set of positive integers. Consider the subsets

$A = \{\omega : \omega \le 12\}$   $B = \{\omega : \omega < 8\}$   $C = \{\omega : \omega \text{ is even}\}$

$D = \{\omega : \omega \text{ is a multiple of 3}\}$   $E = \{\omega : \omega \text{ is a multiple of 4}\}$

Describe in terms of $A, B, C, D, E$ and their complements the following sets:

a. {1,3,5,7} 

b. {3, 6, 9} 

c. {8, 10} 

d. The even integers greater than 12. 

e. The positive integers which are multiples of six. 

f. The integers which are even and no greater than 6 or which are odd and greater than 12. 

Exercise 1.2 (Solution on p. 23.) 

Let $\Omega$ be the set of integers 0 through 10. Let $A = \{5, 6, 7, 8\}$, $B$ = the odd integers in $\Omega$, and
$C$ = the integers in $\Omega$ which are even or less than three. Describe the following sets by listing their
elements.

a. $AB$

b. $AC$

c. $AB^c \cup C$

d. $ABC^c$

e. $A \cup B^c$

f. $A \cup BC^c$

g. $ABC$

h. $A^c BC^c$

Exercise 1.3 (Solution on p. 23.) 

Consider fifteen-word messages in English. Let A = the set of such messages which contain the 
word "bank" and let B = the set of messages which contain the word "bank" and the word "credit." 
Which event has the greater probability? Why? 

Exercise 1.4 (Solution on p. 23.) 

A group of five persons consists of two men and three women. They are selected one-by-one in a 
random manner. Let $E_i$ be the event a man is selected on the $i$th selection. Write an expression
for the event that both men have been selected by the third selection. 

Exercise 1.5 (Solution on p. 23.) 

Two persons play a game consecutively until one of them is successful or there are ten unsuccessful 
plays. Let $E_i$ be the event of a success on the $i$th play of the game. Let $A$, $B$, $C$ be the respective
events that player one, player two, or neither wins. Write an expression for each of these events in
terms of the events $E_i$, $1 \le i \le 10$.

Exercise 1.6 (Solution on p. 23.) 

Suppose the game in Exercise 1.5 could, in principle, be played an unlimited number of times. 
Write an expression for the event D that the game will be terminated with a success in a finite 
number of times. Write an expression for the event E that the game will never terminate. 

Exercise 1.7 (Solution on p. 23.) 

Find the (classical) probability that among three random digits, with each digit (0 through 9) 
being equally likely and each triple equally likely: 

a. All three are alike. 

b. No two are alike. 

c. The first digit is 0. 

d. Exactly two are alike.

Exercise 1.8 (Solution on p. 23.)

The classical probability model is based on the assumption of equally likely outcomes. Some care
must be shown in analysis to be certain that this assumption is good. A well known example is the
following. Two coins are tossed. One of three outcomes is observed: Let $\omega_1$ be the outcome both
are "heads," $\omega_2$ the outcome that both are "tails," and $\omega_3$ be the outcome that they are different.
Is it reasonable to suppose these three outcomes are equally likely? What probabilities would you
assign?

Exercise 1.9 (Solution on p. 23.) 

A committee of five is chosen from a group of 20 people. What is the probability that a specified 
member of the group will be on the committee? 

Exercise 1.10 (Solution on p. 23.) 

Ten employees of a company drive their cars to the city each day and park randomly in ten spots. 
What is the (classical) probability that on a given day Jim will be in place three? There are $n!$
equally likely ways to arrange n items (order important). 

Exercise 1.11 (Solution on p. 23.) 

An extension of the classical model involves the use of areas. A certain region L (say of land) is 
taken as a reference. For any subregion $A$, define $P(A) = \text{area}(A) / \text{area}(L)$. Show that $P(\cdot)$ is a
probability measure on the subregions of $L$.

Exercise 1.12 (Solution on p. 23.) 

John thinks the probability the Houston Texans will win next Sunday is 0.3 and the probability 
the Dallas Cowboys will win is 0.7 (they are not playing each other). He thinks the probability both 
will win is somewhere in between, say 0.5. Is that a reasonable assumption? Justify your answer.

Exercise 1.13 (Solution on p. 23.) 

Suppose P (A) = 0.5 and P (B) = 0.3. What is the largest possible value of P (AB)? Using 
the maximum value of $P(AB)$, determine $P(AB^c)$, $P(A^c B)$, $P(A^c B^c)$ and $P(A \cup B)$. Are these
values determined uniquely? 

Exercise 1.14 (Solution on p. 24.) 

For each of the following probability "assignments", fill out the table. Which assignments are not 
permissible? Explain why, in each case. 



$P(A)$   $P(B)$   $P(AB)$   $P(A \cup B)$   $P(AB^c)$   $P(A^c B)$   $P(A) + P(B)$
0.3      0.7      0.4
0.2      0.1      0.4
0.3      0.7      0.2
0.3      0.5      0
0.3      0.8      0

Table 1.2



Exercise 1.15 (Solution on p. 24.) 

The class {A, B, C} of events is a partition. Event A is twice as likely as C and event B is as 
likely as the combination A or C. Determine the probabilities P (A) , P (B) , P (C). 

Exercise 1.16 (Solution on p. 24.) 

Determine the probability $P(A \cup B \cup C)$ in terms of the probabilities of the events $A$, $B$, $C$ and
their intersections. 

Exercise 1.17 (Solution on p. 24.) 

If occurrence of event $A$ implies occurrence of $B$, show that $P(A^c B) = P(B) - P(A)$.

Exercise 1.18 (Solution on p. 24.)

Show that $P(AB) \ge P(A) + P(B) - 1$.

Exercise 1.19 (Solution on p. 24.) 

The set combination $A \oplus B = AB^c \bigvee A^c B$ is known as the disjunctive union or the symmetric
difference of $A$ and $B$. This is the event that only one of the events $A$ or $B$ occurs on a trial.
Determine $P(A \oplus B)$ in terms of $P(A)$, $P(B)$, and $P(AB)$.

Exercise 1.20 (Solution on p. 24.) 

Use fundamental properties of probability to show 

a. $P(AB) \le P(A) \le P(A \cup B) \le P(A) + P(B)$

b. $P\left(\bigcap_{j=1}^{\infty} E_j\right) \le P(E_i) \le P\left(\bigcup_{j=1}^{\infty} E_j\right) \le \sum_{j=1}^{\infty} P(E_j)$

Exercise 1.21 (Solution on p. 24.) 

Suppose $P_1$, $P_2$ are probability measures and $c_1$, $c_2$ are positive numbers such that $c_1 + c_2 = 1$.
Show that the assignment $P(E) = c_1 P_1(E) + c_2 P_2(E)$ to the class of events is a probability
measure. Such a combination of probability measures is known as a mixture. Extend this to

$$P(E) = \sum_{i=1}^{n} c_i P_i(E), \quad \text{where the } P_i \text{ are probability measures, } c_i > 0, \text{ and } \sum_{i=1}^{n} c_i = 1 \tag{1.36}$$

Exercise 1.22 (Solution on p. 24.) 

Suppose $\{A_1, A_2, \cdots, A_n\}$ is a partition and $\{c_1, c_2, \cdots, c_n\}$ is a class of positive constants. For
each event $E$, let

$$Q(E) = \sum_{i=1}^{n} c_i P(EA_i) \Big/ \sum_{i=1}^{n} c_i P(A_i) \tag{1.37}$$

Show that $Q(\cdot)$ is a probability measure.

Solutions to Exercises in Chapter 1

Solution to Exercise 1.1 (p. 19)

$a = BC^c$, $b = DAE^c$, $c = CAB^c D^c$, $d = CA^c$, $e = CD$, $f = BC \bigvee A^c C^c$
Solution to Exercise 1.2 (p. 20) 

a. $AB = \{5, 7\}$

b. $AC = \{6, 8\}$

c. $AB^c \cup C = C$

d. $ABC^c = AB$

e. $A \cup B^c = \{0, 2, 4, 5, 6, 7, 8, 10\}$

f. $A \cup BC^c = \{3, 5, 6, 7, 8, 9\}$

g. $ABC = \emptyset$

h. $A^c BC^c = \{3, 9\}$

Solution to Exercise 1.3 (p. 20) 

$B \subset A$ implies $P(B) \le P(A)$.
Solution to Exercise 1.4 (p. 20) 

$A = E_1 E_2 \bigvee E_1 E_2^c E_3 \bigvee E_1^c E_2 E_3$
Solution to Exercise 1.5 (p. 20) 

$$A = E_1 \bigvee E_1^c E_2^c E_3 \bigvee E_1^c E_2^c E_3^c E_4^c E_5 \bigvee E_1^c E_2^c E_3^c E_4^c E_5^c E_6^c E_7 \bigvee E_1^c E_2^c E_3^c E_4^c E_5^c E_6^c E_7^c E_8^c E_9 \tag{1.38}$$

$$B = E_1^c E_2 \bigvee E_1^c E_2^c E_3^c E_4 \bigvee E_1^c E_2^c E_3^c E_4^c E_5^c E_6 \bigvee E_1^c E_2^c E_3^c E_4^c E_5^c E_6^c E_7^c E_8 \bigvee E_1^c E_2^c E_3^c E_4^c E_5^c E_6^c E_7^c E_8^c E_9^c E_{10}$$

$$C = \bigcap_{i=1}^{10} E_i^c$$

Solution to Exercise 1.6 (p. 20) 

Let $F_0 = \Omega$ and $F_k = \bigcap_{i=1}^{k} E_i^c$ for $k \ge 1$. Then

$D = \bigvee_{n=1}^{\infty} F_{n-1}E_n$ and $F = D^c = \bigcap_{i=1}^{\infty} E_i^c$ (1.39)

Solution to Exercise 1.7 (p. 20) 

Each triple has probability $1/10^3 = 1/1000$.

a. Ten triples, all alike: P = 10/1000. 

b. 10x9x8 triples all different: P = 720/1000. 

c. 100 triples with first one zero: P = 100/1000 

d. C (3, 2) = 3 ways to pick two positions alike; 10 ways to pick the common value; 9 ways to pick the 
other. P = 270/1000. 
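The four counts above can be checked by brute force. The following is only a sketch in Python (not one of the text's m-programs); it assumes the 1000 ordered triples of digits 0 through 9 are equally likely, as the solution states, and the variable names are illustrative.

```python
from itertools import product

# All ordered triples of digits 0-9; each is assumed equally likely.
triples = list(product(range(10), repeat=3))

def count(pred):
    """Number of triples satisfying the predicate."""
    return sum(1 for t in triples if pred(t))

all_alike = count(lambda t: len(set(t)) == 1)          # part a
all_different = count(lambda t: len(set(t)) == 3)      # part b
first_zero = count(lambda t: t[0] == 0)                # part c
exactly_two_alike = count(lambda t: len(set(t)) == 2)  # part d

print(len(triples), all_alike, all_different, first_zero, exactly_two_alike)
```

Dividing each count by 1000 gives the probabilities listed in parts a through d.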

Solution to Exercise 1.8 (p. 21) 

$P(\{\omega_1\}) = P(\{\omega_2\}) = 1/4$, $P(\{\omega_3\}) = 1/2$.
Solution to Exercise 1.9 (p. 21) 

C (20, 5) committees; C(19,4) have a designated member. 

$P = \dfrac{19!}{4!\,15!} \cdot \dfrac{5!\,15!}{20!} = 5/20 = 1/4$ (1.40)

Solution to Exercise 1.10 (p. 21) 

10! permutations. 1x9! permutations with Jim in place 3. P = 9!/10! = 1/10. 
Solution to Exercise 1.11 (p. 21) 

Additivity follows from additivity of areas of disjoint regions. 
Solution to Exercise 1.12 (p. 21) 

$P(AB) = 0.5$ is not reasonable. It must be no greater than the minimum of $P(A) = 0.3$ and $P(B) = 0.7$.
Solution to Exercise 1.13 (p. 21)

Draw a Venn diagram, or use algebraic expressions:

$P(AB^c) = P(A) - P(AB) = 0.2$

$P(A^cB) = P(B) - P(AB) = 0$, $P(A^cB^c) = P(A^c) - P(A^cB) = 0.5$, $P(A \cup B) = 0.5$ (1.41)

Solution to Exercise 1.14 (p. 21)



$P(A)$   $P(B)$   $P(AB)$   $P(A \cup B)$   $P(AB^c)$   $P(A^cB)$   $P(A) + P(B)$
0.3      0.7      0.4       0.6             -0.1        0.3         1.0
0.2      0.1      0.4       -0.1            -0.2        -0.3        0.3
0.3      0.7      0.2       0.8             0.1         0.5         1.0
0.3      0.5      0         0.8             0.3         0.5         0.8
0.3      0.8      0         1.1             0.3         0.8         1.1


Table 1.3 

Only the third and fourth assignments are permissible. 
Solution to Exercise 1.15 (p. 21) 

$P(A) + P(B) + P(C) = 1$, $P(A) = 2P(C)$, and $P(B) = P(A) + P(C) = 3P(C)$, which implies

$P(C) = 1/6$, $P(A) = 1/3$, and $P(B) = 1/2$ (1.42)

Solution to Exercise 1.16 (p. 21) 

$P(A \cup B \cup C) = P(A \cup B) + P(C) - P(AC \cup BC)$

$= P(A) + P(B) - P(AB) + P(C) - P(AC) - P(BC) + P(ABC)$ (1.43)

Solution to Exercise 1.17 (p. 21) 

$P(AB) = P(A)$ and $P(AB) + P(A^cB) = P(B)$ imply $P(A^cB) = P(B) - P(A)$.
Solution to Exercise 1.18 (p. 22) 

Follows from $P(A) + P(B) - P(AB) = P(A \cup B) \le 1$.
Solution to Exercise 1.19 (p. 22) 

A Venn diagram shows $P(A \oplus B) = P(AB^c) + P(A^cB) = P(A) + P(B) - 2P(AB)$.
Solution to Exercise 1.20 (p. 22) 

$AB \subset A \subset A \cup B$ implies $P(AB) \le P(A) \le P(A \cup B) = P(A) + P(B) - P(AB) \le P(A) + P(B)$. The
general case follows similarly, with the last inequality determined by subadditivity.
Solution to Exercise 1.21 (p. 22) 

Clearly $P(E) \ge 0$. $P(\Omega) = c_1P_1(\Omega) + c_2P_2(\Omega) = 1$.

$E = \bigvee_{i=1}^{\infty} E_i$ implies $P(E) = c_1\sum_{i=1}^{\infty} P_1(E_i) + c_2\sum_{i=1}^{\infty} P_2(E_i) = \sum_{i=1}^{\infty} P(E_i)$ (1.44)

The pattern is the same for the general case, except that the sum of two terms is replaced by the sum of $n$
terms $c_iP_i(E)$.
Solution to Exercise 1.22 (p. 22) 

Clearly $Q(E) \ge 0$ and since $A_i\Omega = A_i$ we have $Q(\Omega) = 1$. If

$E = \bigvee_{k=1}^{\infty} E_k$, then $P(EA_i) = \sum_{k=1}^{\infty} P(E_kA_i) \quad \forall i$ (1.45)

Interchanging the order of summation shows that $Q$ is countably additive.



Chapter 2 

Minterm Analysis 



2.1 Minterms 1 

2.1.1 Introduction 

A fundamental problem in elementary probability is to find the probability of a logical (Boolean) combination 
of a finite class of events, when the probabilities of certain other combinations are known. If we partition an 
event F into component events whose probabilities can be determined, then the additivity property implies 
the probability of F is the sum of these component probabilities. Frequently, the event F is a Boolean 
combination of members of a finite class, say, $\{A, B, C\}$ or $\{A, B, C, D\}$. For each such finite class, there
is a fundamental partition determined by the class. The members of this partition are called minterms. Any 
Boolean combination of members of the class can be expressed as the disjoint union of a unique subclass of 
the minterms. If the probability of every minterm in this subclass can be determined, then by additivity the 
probability of the Boolean combination is determined. We examine these ideas in more detail. 

2.1.2 Partitions and minterms 

To see how the fundamental partition arises naturally, consider first the partition of the basic space produced 
by a single event A. 



$\Omega = A \bigvee A^c$ (2.1)



Now if B is a second event, then 



$A = AB \bigvee AB^c$ and $A^c = A^cB \bigvee A^cB^c$, so that $\Omega = A^cB^c \bigvee A^cB \bigvee AB^c \bigvee AB$ (2.2)

The pair $\{A, B\}$ has partitioned $\Omega$ into $\{A^cB^c, A^cB, AB^c, AB\}$. Continuation in this way leads systematically
to a partition by three events $\{A, B, C\}$, four events $\{A, B, C, D\}$, etc.

We illustrate the fundamental patterns in the case of four events {A, B, C, D}. We form the minterms 
as intersections of members of the class, with various patterns of complementation. For a class of four events, 
there are $2^4 = 16$ such patterns, hence 16 minterms. These are, in a systematic arrangement,



$A^cB^cC^cD^c$   $A^cBC^cD^c$   $AB^cC^cD^c$   $ABC^cD^c$
$A^cB^cC^cD$     $A^cBC^cD$     $AB^cC^cD$     $ABC^cD$
$A^cB^cCD^c$     $A^cBCD^c$     $AB^cCD^c$     $ABCD^c$
$A^cB^cCD$       $A^cBCD$       $AB^cCD$       $ABCD$



1 This content is available online at <http://cnx.Org/content/m23247/l.7/>. 
Table 2.1

No element can be in more than one minterm, because each differs from the others by complementation 
of at least one member event. Each element $\omega$ is assigned to exactly one of the minterms by determining the
answers to four questions: 

Is it in A? Is it in B? Is it in C? Is it in D? 

Suppose, for example, the answers are: Yes, No, No, Yes. Then $\omega$ is in the minterm $AB^cC^cD$. In a
similar way, we can determine the membership of each $\omega$ in the basic space. Thus, the minterms form a
partition. That is, the minterms represent mutually exclusive events, one of which is sure to occur on each 
trial. The membership of any minterm depends upon the membership of each generating set A, B, C or D, 
and the relationships between them. For some classes, one or more of the minterms are empty (impossible 
events). As we see below, this causes no problems. 

An examination of the development above shows that if we begin with a class of n events, there are 
$2^n$ minterms. To aid in systematic handling, we introduce a simple numbering system for the minterms,
which we illustrate by considering again the four events A, B, C, D , in that order. The answers to the four 
questions above can be represented numerically by the scheme 

No ~ 0 and Yes ~ 1

Thus, if $\omega$ is in $A^cB^cC^cD^c$, the answers are tabulated as 0 0 0 0. If $\omega$ is in $AB^cC^cD$, then this is designated
1 0 0 1. With this scheme, the minterm arrangement above becomes



0000 ~ 0    0100 ~ 4    1000 ~ 8     1100 ~ 12
0001 ~ 1    0101 ~ 5    1001 ~ 9     1101 ~ 13
0010 ~ 2    0110 ~ 6    1010 ~ 10    1110 ~ 14
0011 ~ 3    0111 ~ 7    1011 ~ 11    1111 ~ 15



Table 2.2 

We may view these quadruples of zeros and ones as binary representations of integers, which may also 
be represented by their decimal equivalents, as shown in the table. Frequently, it is useful to refer to 
the minterms by number. If the members of the generating class are treated in a fixed order, then each 
minterm number arrived at in the manner above specifies a minterm uniquely. Thus, for the generating class 
{A, B, C, D}, in that order, we may designate 



$A^cB^cC^cD^c = M_0$ (minterm 0) $\quad AB^cC^cD = M_9$ (minterm 9), etc. (2.3)
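The numbering scheme amounts to reading the Yes/No answers as binary digits. A minimal Python sketch of that reading follows; the function names `minterm_number` and `minterm_label` are hypothetical helpers introduced here for illustration and are not part of the text's m-programs.

```python
def minterm_number(answers):
    """Map yes/no answers (True/False), in generating-class order, to the minterm number."""
    number = 0
    for answer in answers:
        number = 2 * number + (1 if answer else 0)  # shift left, append the new binary digit
    return number

def minterm_label(number, names="ABCD"):
    """Write minterm `number` as a product such as AB^cC^cD (^c marks a complement)."""
    bits = format(number, "0{}b".format(len(names)))
    return "".join(name if bit == "1" else name + "^c" for name, bit in zip(names, bits))

print(minterm_number([True, False, False, True]))  # answers Yes, No, No, Yes
print(minterm_label(9))
print(minterm_label(0))
```

The first call reproduces the designation 1 0 0 1 ~ 9 from the table, and the two labels correspond to $M_9$ and $M_0$ in (2.3).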



We utilize this numbering scheme on special Venn diagrams called minterm maps. These are illustrated in 
Figure 2.1, for the cases of three, four, and five generating events. Since the actual content of any minterm 
depends upon the sets A, B, C, and D in the generating class, it is customary to refer to these sets as 
variables. In the three-variable case, set $A$ is the right half of the diagram and set $C$ is the lower half; but set
B is split, so that it is the union of the second and fourth columns. Similar splits occur in the other cases. 
Remark. Other useful arrangements of minterm maps are employed in the analysis of switching circuits. 
[Figure 2.1 image not reproduced. Cell layout of the three panels, by minterm number:

Three variables
0 2 4 6
1 3 5 7

Four variables
0 4 8 12
1 5 9 13
2 6 10 14
3 7 11 15

Five variables
0 4 8 12 16 20 24 28
1 5 9 13 17 21 25 29
2 6 10 14 18 22 26 30
3 7 11 15 19 23 27 31]



Figure 2.1: Minterm maps for three, four, or five variables. 



2.1.3 Minterm maps and the minterm expansion 

The significance of the minterm partition of the basic space rests in large measure on the following fact. 

Minterm expansion 

Each Boolean combination of the elements in a generating class may be expressed as the disjoint union 
of an appropriate subclass of the minterms. This representation is known as the minterm expansion for the 
combination.

In deriving an expression for a given Boolean combination which holds for any class {A, B, C , D} of four 
events, we include all possible minterms, whether empty or not. If a minterm is empty for a given class, its 
presence does not modify the set content or probability assignment for the Boolean combination. 

The existence and uniqueness of the expansion is made plausible by simple examples utilizing minterm 
maps to determine graphically the minterm content of various Boolean combinations. Using the arrangement 
and numbering system introduced above, we let $M_i$ represent the $i$th minterm (numbering from zero) and
let p (i) represent the probability of that minterm. When we deal with a union of minterms in a minterm 
expansion, it is convenient to utilize the shorthand illustrated in the following. 



$M(1, 3, 7) = M_1 \bigvee M_3 \bigvee M_7$ and $p(1, 3, 7) = p(1) + p(3) + p(7)$ (2.4)






[Figure 2.2 image not reproduced: three-variable minterm map with minterms 1, 6, and 7 marked.]

Figure 2.2: $E = AB \cup A^c(B \cup C^c)^c = M(1, 6, 7)$. Minterm expansion for Example 2.1 (Minterm
expansion)



Consider the following simple example. 

Example 2.1: Minterm expansion 

Suppose $E = AB \cup A^c(B \cup C^c)^c$. Examination of the minterm map in Figure 2.2 shows that
$AB$ consists of the union of minterms $M_6$, $M_7$, which we designate $M(6, 7)$. The combination
$B \cup C^c = M(0, 2, 3, 4, 6, 7)$, so that its complement $(B \cup C^c)^c = M(1, 5)$. This leaves the common
part $A^c(B \cup C^c)^c = M_1$. Hence, $E = M(1, 6, 7)$. Similarly, $F = A \cup B^cC = M(1, 4, 5, 6, 7)$.

A key to establishing the expansion is to note that each minterm is either a subset of the combination or is 
disjoint from it. The expansion is thus the union of those minterms included in the combination. A general 
verification using indicator functions is sketched in the last section of this module. 
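Because each minterm is either inside a Boolean combination or disjoint from it, the expansion can be found mechanically by testing the combination on each pattern of answers. The Python sketch below does this for the two combinations of Example 2.1; `minterm_expansion` is a hypothetical helper written for this illustration, not a function from the text.

```python
from itertools import product

def minterm_expansion(combo, n):
    """Return the minterm numbers on which the Boolean function `combo` is true.

    `combo` takes n booleans, one per generating event, in class order.
    product(...) enumerates the patterns in binary counting order, so the
    position of a pattern in the enumeration is its minterm number.
    """
    indices = []
    for number, bits in enumerate(product([False, True], repeat=n)):
        if combo(*bits):
            indices.append(number)
    return indices

# E = AB u A^c (B u C^c)^c and F = A u B^c C, as in Example 2.1
E = minterm_expansion(lambda a, b, c: (a and b) or ((not a) and not (b or not c)), 3)
F = minterm_expansion(lambda a, b, c: a or ((not b) and c), 3)
print(E)
print(F)
```

The two lists agree with $M(1, 6, 7)$ and $M(1, 4, 5, 6, 7)$ read from the minterm map.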



2.1.4 Use of minterm maps 

A typical problem seeks the probability of certain Boolean combinations of a class of events when the 
probabilities of various other combinations are given. We consider several simple examples and illustrate the
use of minterm maps in formulation and solution. 
Example 2.2: Survey on software

Statistical data are taken for a certain student population with personal computers. An individual 
is selected at random. Let A = the event the person selected has word processing, B = the event 
he or she has a spread sheet program, and C = the event the person has a data base program. The 
data imply 

• The probability is 0.80 that the person has a word processing program: P (A) = 0.8 

• The probability is 0.65 that the person has a spread sheet program: P (B) = 0.65 

• The probability is 0.30 that the person has a data base program: P (C) = 0.3 

• The probability is 0.10 that the person has all three: P (ABC) = 0.1 

• The probability is 0.05 that the person has neither word processing nor spread sheet:
$P(A^cB^c) = 0.05$

• The probability is 0.65 that the person has at least two: $P(AB \cup AC \cup BC) = 0.65$

• The probability of word processor and data base, but no spread sheet is twice the probability
of spread sheet and data base, but no word processor: $P(AB^cC) = 2P(A^cBC)$

a. What is the probability that the person has exactly two of the programs? 

b. What is the probability that the person has only the data base program? 

Several questions arise: 

• Are these data consistent? 

• Are the data sufficient to answer the questions? 

• How may the data be utilized to answer the questions?

SOLUTION 

The data, expressed in terms of minterm probabilities, are: 

$P(A) = p(4, 5, 6, 7) = 0.80$; hence $P(A^c) = p(0, 1, 2, 3) = 0.20$

$P(B) = p(2, 3, 6, 7) = 0.65$; hence $P(B^c) = p(0, 1, 4, 5) = 0.35$

$P(C) = p(1, 3, 5, 7) = 0.30$; hence $P(C^c) = p(0, 2, 4, 6) = 0.70$

$P(ABC) = p(7) = 0.10$ $\quad P(A^cB^c) = p(0, 1) = 0.05$

$P(AB \cup AC \cup BC) = p(3, 5, 6, 7) = 0.65$

$P(AB^cC) = p(5) = 2p(3) = 2P(A^cBC)$

These data are shown on the minterm map in Figure 3a (Figure 2.3). We use the patterns 
displayed in the minterm map to aid in an algebraic solution for the various minterm probabilities. 

$p(2, 3) = p(0, 1, 2, 3) - p(0, 1) = 0.20 - 0.05 = 0.15$

$p(6, 7) = p(2, 3, 6, 7) - p(2, 3) = 0.65 - 0.15 = 0.50$

$p(6) = p(6, 7) - p(7) = 0.50 - 0.10 = 0.40$

$p(3, 5) = p(3, 5, 6, 7) - p(6, 7) = 0.65 - 0.50 = 0.15 \Rightarrow p(3) = 0.05$, $p(5) = 0.10 \Rightarrow p(2) = 0.10$

$p(1) = p(1, 3, 5, 7) - p(3, 5) - p(7) = 0.30 - 0.15 - 0.10 = 0.05 \Rightarrow p(0) = 0$

$p(4) = p(4, 5, 6, 7) - p(5) - p(6, 7) = 0.80 - 0.10 - 0.50 = 0.20$

Thus, all minterm probabilities are determined. They are displayed in Figure 3b (Figure 2.3).
From these we get

$P(A^cBC \bigvee AB^cC \bigvee ABC^c) = p(3, 5, 6) = 0.05 + 0.10 + 0.40 = 0.55$ and $P(A^cB^cC) = p(1) = 0.05$ (2.5)
[Figure 2.3 image not reproduced. Panel a, "Data for software survey", shows the given data on a three-variable minterm map. Panel b, "Minterm probabilities for software survey", shows the following values by minterm number:

0: 0       2: 0.10    4: 0.20    6: 0.40
1: 0.05    3: 0.05    5: 0.10    7: 0.10]



Figure 2.3: Minterm maps for software survey, Example 2.2 (Survey on software) 



Example 2.3: Survey on personal computers 

A survey of 1000 students shows that 565 have PC compatible desktop computers, 515 have Macin- 
tosh desktop computers, and 151 have laptop computers. 51 have all three, 124 have both PC and 
laptop computers, 212 have at least two of the three, and twice as many own both PC and laptop 
as those who have both Macintosh desktop and laptop. A person is selected at random from this 
population. What is the probability he or she has at least one of these types of computer? What 
is the probability the person selected has only a laptop? 
[Figure 2.4 image not reproduced. Minterm probabilities, by minterm number:

0: 0.032    2: 0.376    4: 0.364    6: 0.077
1: 0.016    3: 0.011    5: 0.073    7: 0.051]



Figure 2.4: Minterm probabilities for computer survey, Example 2.3 (Survey on personal computers) 



SOLUTION 

Let A = the event of owning a PC desktop, B = the event of owning a Macintosh desktop, and 
C = the event of owning a laptop. We utilize a minterm map for three variables to help determine 
minterm patterns. For example, the event $AC = M_5 \bigvee M_7$ so that $P(AC) = p(5) + p(7) = p(5, 7)$.

The data, expressed in terms of minterm probabilities, are: 

$P(A) = p(4, 5, 6, 7) = 0.565$, hence $P(A^c) = p(0, 1, 2, 3) = 0.435$

$P(B) = p(2, 3, 6, 7) = 0.515$, hence $P(B^c) = p(0, 1, 4, 5) = 0.485$

$P(C) = p(1, 3, 5, 7) = 0.151$, hence $P(C^c) = p(0, 2, 4, 6) = 0.849$

$P(ABC) = p(7) = 0.051$ $\quad P(AC) = p(5, 7) = 0.124$

$P(AB \cup AC \cup BC) = p(3, 5, 6, 7) = 0.212$

$P(AC) = p(5, 7) = 2p(3, 7) = 2P(BC)$

We use the patterns displayed in the minterm map to aid in an algebraic solution for the various 
minterm probabilities. 

$p(5) = p(5, 7) - p(7) = 0.124 - 0.051 = 0.073$

$p(1, 3) = P(A^cC) = 0.151 - 0.124 = 0.027$ $\quad P(AC^c) = p(4, 6) = 0.565 - 0.124 = 0.441$

$p(3, 7) = P(BC) = 0.124/2 = 0.062$

$p(3) = 0.062 - 0.051 = 0.011$

$p(6) = p(3, 5, 6, 7) - p(3) - p(5, 7) = 0.212 - 0.011 - 0.124 = 0.077$

$p(4) = P(A) - p(6) - p(5, 7) = 0.565 - 0.077 - 0.124 = 0.364$

$p(1) = p(1, 3) - p(3) = 0.027 - 0.011 = 0.016$

$p(2) = P(B) - p(3, 7) - p(6) = 0.515 - 0.062 - 0.077 = 0.376$

$p(0) = P(C^c) - p(4, 6) - p(2) = 0.849 - 0.441 - 0.376 = 0.032$

We have determined the minterm probabilities, which are displayed on the minterm map in Figure 2.4.
We may now compute the probability of any Boolean combination of the generating events
$A$, $B$, $C$. Thus,

$P(A \cup B \cup C) = 1 - P(A^cB^cC^c) = 1 - p(0) = 0.968$ and $P(A^cB^cC) = p(1) = 0.016$ (2.6)
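One way to answer the consistency question is to check that the computed minterm probabilities reproduce every item of the survey data. The Python sketch below does so for the computer survey, working in counts per 1000 students so that the arithmetic is exact; the names `counts`, `total`, and `checks` are illustrative only.

```python
# Minterm counts per 1000 students, indexed by minterm number 0..7
# (A = PC desktop, B = Macintosh desktop, C = laptop), taken from the solution above.
counts = [32, 16, 376, 11, 364, 73, 77, 51]

def total(*minterms):
    """Sum of the counts over the listed minterm numbers."""
    return sum(counts[i] for i in minterms)

# Each entry pairs the value computed from the minterms with the survey datum.
checks = {
    "all students": (total(*range(8)), 1000),
    "PC desktop": (total(4, 5, 6, 7), 565),
    "Macintosh desktop": (total(2, 3, 6, 7), 515),
    "laptop": (total(1, 3, 5, 7), 151),
    "all three": (total(7), 51),
    "PC and laptop": (total(5, 7), 124),
    "at least two": (total(3, 5, 6, 7), 212),
    "twice (Macintosh and laptop)": (2 * total(3, 7), 124),
}
for name, (computed, given) in checks.items():
    print(name, computed, given, computed == given)

consistent = all(computed == given for computed, given in checks.values())
at_least_one = 1000 - counts[0]   # count for A u B u C
print(consistent, at_least_one / 1000)
```

If any line printed False, the hand elimination would contain an arithmetic slip or the data would be inconsistent.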
Figure 2.5: Minterm probabilities for opinion survey, Example 2.4 (Opinion survey)



Example 2.4: Opinion survey 

A survey of 1000 persons is made to determine their opinions on four propositions. Let A, B, C, D 
be the events a person selected agrees with the respective propositions. Survey results show the 
following probabilities for various combinations: 



$P(A) = 0.200$, $P(B) = 0.500$, $P(C) = 0.300$, $P(D) = 0.700$, $P(A(B \cup C^c)D^c) = 0.055$ (2.7)

$P(A \cup BC \cup D^c) = 0.520$, $P(A^cBC^cD) = 0.200$, $P(ABCD) = 0.015$, $P(AB^cC) = 0.030$ (2.8)

$P(A^cB^cC^cD) = 0.195$, $P(A^cBC) = 0.120$, $P(A^cB^cD^c) = 0.120$, $P(AC^c) = 0.140$ (2.9)

$P(ACD^c) = 0.025$, $P(ABC^cD^c) = 0.020$ (2.10)

Determine the probabilities for each minterm and for each of the following combinations 

$A^c(BC^c \cup B^cC)$ - that is, not $A$ and ($B$ or $C$, but not both)

$A \cup BC^c$ - that is, $A$ or ($B$ and not $C$)

SOLUTION 

At the outset, it is not clear that the data are consistent or sufficient to determine the minterm 
probabilities. However, an examination of the data shows that there are sixteen items (including the 
fact that the sum of all minterm probabilities is one). Thus, there is hope, but no assurance, that a 
solution exists. A step elimination procedure, as in the previous examples, shows that all minterms 
can in fact be calculated. The results are displayed on the minterm map in Figure 2.5. It would be 
desirable to be able to analyze the problem systematically. The formulation above suggests a more 
systematic algebraic formulation which should make possible machine aided solution. 
2.1.5 Systematic formulation

Use of a minterm map has the advantage of visualizing the minterm expansion in direct relation to the 
Boolean combination. The algebraic solutions of the previous problems involved ad hoc manipulations of 
the data minterm probability combinations to find the probability of the desired target combination. We 
seek a systematic formulation of the data as a set of linear algebraic equations with the minterm probabilities 
as unknowns, so that standard methods of solution may be employed. Consider again the software survey 
of Example 2.2 (Survey on software). 

Example 2.5: The software survey problem reformulated 

The data, expressed in terms of minterm probabilities, are: 

$P(A) = p(4, 5, 6, 7) = 0.80$

$P(B) = p(2, 3, 6, 7) = 0.65$

$P(C) = p(1, 3, 5, 7) = 0.30$

$P(ABC) = p(7) = 0.10$

$P(A^cB^c) = p(0, 1) = 0.05$

$P(AB \cup AC \cup BC) = p(3, 5, 6, 7) = 0.65$

$P(AB^cC) = p(5) = 2p(3) = 2P(A^cBC)$, so that $p(5) - 2p(3) = 0$

We also have in any case

$P(\Omega) = P(A \cup A^c) = p(0, 1, 2, 3, 4, 5, 6, 7) = 1$

to complete the eight items of data needed for determining all eight minterm probabilities. The 
first datum can be expressed as an equation in minterm probabilities: 



$0 \cdot p(0) + 0 \cdot p(1) + 0 \cdot p(2) + 0 \cdot p(3) + 1 \cdot p(4) + 1 \cdot p(5) + 1 \cdot p(6) + 1 \cdot p(7) = 0.80$ (2.11)

This is an algebraic equation in $p(0), \cdots, p(7)$ with a matrix of coefficients

$[0\ 0\ 0\ 0\ 1\ 1\ 1\ 1]$ (2.12)



The others may be written out accordingly, giving eight linear algebraic equations in eight variables 
p(0) through p(7). Each equation has a matrix or vector of zero-one coefficients indicating which 
minterms are included. These may be written in matrix form as follows: 
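$\begin{bmatrix} 1&1&1&1&1&1&1&1 \\ 0&0&0&0&1&1&1&1 \\ 0&0&1&1&0&0&1&1 \\ 0&1&0&1&0&1&0&1 \\ 0&0&0&0&0&0&0&1 \\ 1&1&0&0&0&0&0&0 \\ 0&0&0&1&0&1&1&1 \\ 0&0&0&-2&0&1&0&0 \end{bmatrix} \begin{bmatrix} p(0) \\ p(1) \\ p(2) \\ p(3) \\ p(4) \\ p(5) \\ p(6) \\ p(7) \end{bmatrix} = \begin{bmatrix} 1 \\ 0.80 \\ 0.65 \\ 0.30 \\ 0.10 \\ 0.05 \\ 0.65 \\ 0 \end{bmatrix}$ (2.13)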



• The patterns in the coefficient matrix are determined by logical operations. We obtained 
these with the aid of a minterm map. 

• The solution utilizes an algebraic procedure, which could be carried out in a variety of ways, 
including several standard computer packages for matrix operations. 

We show in the module Minterm Vectors and MATLAB (Section 2.2.1: Minterm vectors and 
MATLAB ) how we may use MATLAB for both aspects. 
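As a rough illustration of the algebraic step only, the eight equations of Example 2.5 can be solved by ordinary elimination. The sketch below is in Python rather than MATLAB and uses exact fractions; the function `solve` and the row ordering of `coefficients` are choices made here for illustration, not the text's procedure.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination with exact fractions.

    Assumes the system has a unique solution (a singular matrix makes the
    pivot search fail with StopIteration).
    """
    n = len(matrix)
    rows = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# One zero-one (or, for the ratio condition, integer) coefficient row per datum.
coefficients = [
    [0, 0, 0, 0, 1, 1, 1, 1],    # P(A) = p(4,5,6,7)
    [0, 0, 1, 1, 0, 0, 1, 1],    # P(B) = p(2,3,6,7)
    [0, 1, 0, 1, 0, 1, 0, 1],    # P(C) = p(1,3,5,7)
    [0, 0, 0, 0, 0, 0, 0, 1],    # P(ABC) = p(7)
    [1, 1, 0, 0, 0, 0, 0, 0],    # P(A^c B^c) = p(0,1)
    [0, 0, 0, 1, 0, 1, 1, 1],    # P(AB u AC u BC) = p(3,5,6,7)
    [0, 0, 0, -2, 0, 1, 0, 0],   # p(5) - 2 p(3) = 0
    [1, 1, 1, 1, 1, 1, 1, 1],    # P(Omega) = 1
]
data = ["0.80", "0.65", "0.30", "0.10", "0.05", "0.65", "0", "1"]

p = solve(coefficients, data)
print([float(x) for x in p])
```

The solution vector should match the minterm probabilities found by hand in Example 2.2. Exact fractions are used so that the comparison is not blurred by floating-point rounding.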
2.1.6 Indicator functions and the minterm expansion

Previous discussion of the indicator function shows that the indicator function for a Boolean combination of 
sets is a numerical valued function of the indicator functions for the individual sets. 

• As an indicator function, it takes on only the values zero and one. 

• The value of the indicator function for any Boolean combination must be constant on each minterm. 
For example, for each $\omega$ in the minterm $AB^cCD^c$, we must have $I_A(\omega) = 1$, $I_B(\omega) = 0$, $I_C(\omega) = 1$,
and $I_D(\omega) = 0$. Thus, any function of $I_A$, $I_B$, $I_C$, $I_D$ must be constant over the minterm.

• Consider a Boolean combination $E$ of the generating sets. If $\omega$ is in $E \cap M_i$, then $I_E(\omega) = 1$ for all
$\omega \in M_i$, so that $M_i \subset E$. Since each $\omega \in M_i$ for some $i$, $E$ must be the union of those minterms sharing
an $\omega$ with $E$.

• Let $\{M_i : i \in J_E\}$ be the subclass of those minterms on which $I_E$ has the value one. Then

$E = \bigvee_{J_E} M_i$ (2.14)

which is the minterm expansion of $E$.



2.2 Minterms and MATLAB Calculations 2 

The concepts and procedures in this unit play a significant role in many aspects of the analysis of probability 
topics and in the use of MATLAB throughout this work. 

2.2.1 Minterm vectors and MATLAB 

The systematic formulation in the previous module Minterms (Section 2.1) shows that each Boolean com- 
bination, as a union of minterms, can be designated by a vector of zero-one coefficients. A coefficient one 
in the $i$th position (numbering from zero) indicates the inclusion of minterm $M_i$ in the union. We formulate
this pattern carefully below and show how MATLAB logical operations may be utilized in problem setup 
and solution. 

Suppose E is a Boolean combination of A, B, C. Then, by the minterm expansion, 

$E = \bigvee_{J_E} M_i$ (2.15)

where $M_i$ is the $i$th minterm and $J_E$ is the set of indices for those $M_i$ included in $E$. For example, consider

$E = A(B \cup C^c) \cup A^c(B \cup C^c)^c = M_1 \bigvee M_4 \bigvee M_6 \bigvee M_7 = M(1, 4, 6, 7)$ (2.16)

$F = A^cB^c \cup AC = M_0 \bigvee M_1 \bigvee M_5 \bigvee M_7 = M(0, 1, 5, 7)$ (2.17)

We may designate each set by a pattern of zeros and ones $(e_0, e_1, \cdots, e_7)$. The ones indicate which
minterms are present in the set. In the pattern for set $E$, minterm $M_i$ is included in $E$ iff $e_i = 1$. This
is, in effect, another arrangement of the minterm map. In this form, it is convenient to view the pattern
as a minterm vector, which may be represented by a row matrix or row vector $[e_0\ e_1\ \cdots\ e_7]$. We find
it convenient to use the same symbol for the name of the event and for the minterm vector or matrix
representing it. Thus, for the examples above,

$E \sim [0\ 1\ 0\ 0\ 1\ 0\ 1\ 1]$ and $F \sim [1\ 1\ 0\ 0\ 0\ 1\ 0\ 1]$ (2.18)



2 This content is available online at <http://cnx.Org/content/m23248/l.8/>. 
It should be apparent that this formalization can be extended to sets generated by any finite class.
Minterm vectors for Boolean combinations 

If E and F are combinations of n generating sets, then each is represented by a unique minterm vector 
of length $2^n$. In the treatment in the module Minterms (Section 2.1), we determine the minterm vector with
the aid of a minterm map. We wish to develop a systematic way to determine these vectors. 

As a first step, we suppose we have minterm vectors for E and F and want to obtain the minterm vector 
of Boolean combinations of these. 

1. The minterm expansion for $E \cup F$ has all the minterms in either set. This means the $j$th element of
the vector for $E \cup F$ is the maximum of the $j$th elements for the two vectors.

2. The minterm expansion for $E \cap F$ has only those minterms in both sets. This means the $j$th element
of the vector for $E \cap F$ is the minimum of the $j$th elements for the two vectors.

3. The minterm expansion for $E^c$ has only those minterms not in the expansion for $E$. This means the
vector for $E^c$ has zeros and ones interchanged. The $j$th element of $E^c$ is one iff the corresponding
element of $E$ is zero.

We illustrate for the case of the two combinations $E$ and $F$ of three generating sets, considered above

$E = A(B \cup C^c) \cup A^c(B \cup C^c)^c \sim [0\ 1\ 0\ 0\ 1\ 0\ 1\ 1]$ and $F = A^cB^c \cup AC \sim [1\ 1\ 0\ 0\ 0\ 1\ 0\ 1]$ (2.19)

Then

$E \cup F \sim [1\ 1\ 0\ 0\ 1\ 1\ 1\ 1]$, $E \cap F \sim [0\ 1\ 0\ 0\ 0\ 0\ 0\ 1]$, and $E^c \sim [1\ 0\ 1\ 1\ 0\ 1\ 0\ 0]$ (2.20)

MATLAB logical operations 

MATLAB logical operations on zero-one matrices provide a convenient way of handling Boolean combi- 
nations of minterm vectors represented as matrices. For two zero-one matrices E, F of the same size 

E|F is the matrix obtained by taking the maximum element in each place.
E&F is the matrix obtained by taking the minimum element in each place.
$E^c$ is the matrix obtained by interchanging one and zero in each place in E.

Thus, if E, F are minterm vectors for sets by the same name, then E|F is the minterm vector for $E \cup F$,
E&F is the minterm vector for $E \cap F$, and $E^c = 1 - E$ is the minterm vector for $E^c$.

This suggests a general approach to determining minterm vectors for Boolean combinations. 

1. Start with minterm vectors for the generating sets. 

2. Use MATLAB logical operations to obtain the minterm vector for any Boolean combination. 

Suppose, for example, the class of generating sets is {A, B, C}. Then the minterm vectors for A, B, and C, 
respectively, are 

A= [0 0001111] B= [0 0110011] C = [01010101] (2.21) 

If $E = AB \cup C^c$, then the logical combination E = (A&B) | ~C of the matrices yields E = [1 0 1 0 1 0 1 1].
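The max, min, and "one minus" rules can be tried out in any language with lists. The Python sketch below mirrors them for the combination $E = AB \cup C^c$; the helper names `union`, `intersection`, and `complement` are introduced here for illustration and stand in for the MATLAB operators |, &, and ~.

```python
# Minterm vectors for the generating sets A, B, C, as in (2.21).
A = [0, 0, 0, 0, 1, 1, 1, 1]
B = [0, 0, 1, 1, 0, 0, 1, 1]
C = [0, 1, 0, 1, 0, 1, 0, 1]

def union(e, f):
    """Elementwise maximum: minterm vector of E u F."""
    return [max(x, y) for x, y in zip(e, f)]

def intersection(e, f):
    """Elementwise minimum: minterm vector of EF."""
    return [min(x, y) for x, y in zip(e, f)]

def complement(e):
    """Interchange zeros and ones: minterm vector of E^c."""
    return [1 - x for x in e]

E = union(intersection(A, B), complement(C))   # E = AB u C^c
print(E)
```

The printed vector has ones in positions 0, 2, 4, 6, 7, that is, $E = M(0, 2, 4, 6, 7)$.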
MATLAB implementation 

A key step in the procedure just outlined is to obtain the minterm vectors for the generating elements
$\{A, B, C\}$. We have an m-function to provide such fundamental vectors. For example, to produce the
second minterm vector for the family (i.e., the minterm vector for $B$), the basic zero-one pattern 0 0 1 1 is
replicated twice to give

0 0 1 1 0 0 1 1

The function minterm(n,k) generates the kth minterm vector for a class of n generating sets.
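The replication idea can be sketched in a few lines. The Python function below borrows the name `minterm` from the text's m-function purely for readability; it is an independent re-implementation written for this illustration, and its internals are an assumption rather than the author's code.

```python
def minterm(n, k):
    """0-1 minterm vector of length 2**n for the k-th of n generating sets (k = 1, ..., n)."""
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    block = 2 ** (n - k)                  # length of each run of zeros or ones
    pattern = [0] * block + [1] * block   # basic zero-one pattern, e.g. 0 0 1 1 for n = 3, k = 2
    return pattern * (2 ** (k - 1))       # replicate the pattern to fill 2**n places

for k in (1, 2, 3):
    print(minterm(3, k))
```

For n = 3 the three printed vectors are the ones for A, B, and C: the pattern for the kth set has runs of length $2^{n-k}$ and is repeated $2^{k-1}$ times.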

Example 2.6: Minterms for the class {A, B, C}.

>> A = minterm(3,1)
A = 0 0 0 0 1 1 1 1
>> B = minterm(3,2)
B = 0 0 1 1 0 0 1 1
>> C = minterm(3,3)
C = 0 1 0 1 0 1 0 1

Example 2.7: Minterm patterns for the Boolean combinations 

$F = AB \cup B^cC$ $\quad G = A \cup A^cC$

>> F = (A&B)|(~B&C)
F = 0 1 0 0 0 1 1 1
>> G = A|(~A&C)
G = 0 1 0 1 1 1 1 1
>> JF = find(F)-1            % Use of find to determine index set for F
JF = 1 5 6 7                 % Shows F = M(1, 5, 6, 7)

These basic minterm patterns are useful not only for Boolean combinations of events but also for many 
aspects of the analysis of those random variables which take on only a finite number of values. 

Zero-one arrays in MATLAB 

The treatment above hides the fact that a rectangular array of zeros and ones can have two quite different 
meanings and functions in MATLAB. 

1. A numerical matrix (or vector) subject to the usual operations on matrices.

2. A logical array whose elements are combined by
a. Logical operators to give new logical arrays;
b. Array operations (element by element) to give numerical matrices;
c. Array operations with numerical matrices to give numerical results.

Some simple examples will illustrate the principal properties. 

>> A = minterm(3,1);
>> B = minterm(3,2);
>> C = minterm(3,3);
>> F = (A&B)|(~B&C)
F = 0 1 0 0 0 1 1 1
>> G = A|(~A&C)
G = 0 1 0 1 1 1 1 1
>> islogical(A)              % Test for logical array
ans = 0
>> islogical(F)
ans = 1
>> m = max(A,B)              % A matrix operation
m = 0 0 1 1 1 1 1 1
>> islogical(m)
ans = 0
>> m1 = A|B                  % A logical operation
m1 = 0 0 1 1 1 1 1 1
>> islogical(m1)
ans = 1
>> a = logical(A)            % Converts 0-1 matrix into logical array
a = 0 0 0 0 1 1 1 1
>> b = logical(B)
>> m2 = a|b
m2 = 0 0 1 1 1 1 1 1
>> p = dot(A,B)              % Equivalently, p = A*B'
p = 2
>> p1 = total(A.*b)
p1 = 2
>> p3 = total(A.*B)
p3 = 2
>> p4 = a*b'                 % Cannot use matrix operations on logical arrays
??? Error using ==> mtimes   % MATLAB error signal
Logical inputs must be scalar.

Often it is desirable to have a table of the generating minterm vectors. Use of the function minterm in a 
simple "for loop" yields the following m-function. 

The function mintable(n) Generates a table of minterm vectors for n generating sets. 

Example 2.8: Mintable for three variables

>> M3 = mintable(3)
M3 = 0 0 0 0 1 1 1 1
     0 0 1 1 0 0 1 1
     0 1 0 1 0 1 0 1



As an application of mintable, consider the problem of determining the probability of k of n events. If
$\{A_i : 1 \le i \le n\}$ is any finite class of events, the event that exactly $k$ of these events occur on a trial can be
characterized simply in terms of the minterm expansion. The event $A_{kn}$ that exactly $k$ occur is given by

$A_{kn}$ = the disjoint union of those minterms with exactly $k$ positions uncomplemented (2.22)

In the matrix M = mintable(n) these are the minterms corresponding to columns with exactly $k$ ones. The
event $B_{kn}$ that $k$ or more occur is given by

$B_{kn} = \bigvee_{r=k}^{n} A_{rn}$ (2.23)

If we have the minterm probabilities, it is easy to pick out the appropriate minterms and combine the 
probabilities. The following example in the case of three variables illustrates the procedure. 

Example 2.9: The software survey (continued) 

In the software survey problem, the minterm probabilities are 

pm= [0 0.05 0.10 0.05 0.20 0.10 0.40 0.10] (2.24) 

where A = event has word processor, B = event has spread sheet, C = event has a data base 
program. It is desired to get the probability an individual selected has k of these, k = 0, 1, 2, 3. 

SOLUTION 

We form a mintable for three variables. We count the number of "successes" corresponding to 
each minterm by using the MATLAB function sum, which gives the sum of each column. In this 
case, it would be easy to determine each distinct value and add the probabilities on the minterms 
which yield this value. For more complicated cases, we have an m-function called csort (for sort 
and consolidate) to perform this operation. 

>> pm = 0.01*[0 5 10 5 20 10 40 10];
>> M = mintable(3)
M =
     0 0 0 0 1 1 1 1
     0 0 1 1 0 0 1 1
     0 1 0 1 0 1 0 1
>> T = sum(M)                % Column sums give number
T = 0 1 1 2 1 2 2 3          % of successes on each
>> [k,pk] = csort(T,pm);     % minterm, determines
                             % distinct values in T and
>> disp([k;pk]')             % consolidates probabilities
         0         0
    1.0000    0.3500
    2.0000    0.5500
    3.0000    0.1000

For three variables, it is easy enough to identify the various combinations "by eye" and make the 
combinations. For a larger number of variables, however, this may become tedious. The approach 
is much more useful in the case of Independent Events, because of the ease of determining the 
minterm probabilities. 
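The counting step of Example 2.9 can also be sketched with base MATLAB functions only. The following is a minimal sketch, not the text's mintable or csort: it assumes dec2bin, unique, and accumarray as stand-ins, and the variable names idx and P2more are illustrative.

```matlab
% Sketch only: dec2bin/unique/accumarray stand in for mintable and csort.
n  = 3;
pm = 0.01*[0 5 10 5 20 10 40 10];     % minterm probabilities (2.24)
M  = (dec2bin(0:2^n-1, n) - '0')';    % n-by-2^n zero-one table, one column per minterm
T  = sum(M, 1);                       % number of ones (successes) in each column
[k, ~, idx] = unique(T);              % distinct counts; idx maps each column to its count
pk = accumarray(idx(:), pm(:))';      % add the probabilities sharing a count
disp([k; pk]')                        % first column k, second column P(exactly k)
P2more = (T >= 2)*pm';                % probability that two or more occur, as in (2.23)
```

The last line illustrates the "k or more" event: a logical row vector selects the relevant minterms, and the product with pm' adds their probabilities.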

Minvec procedures 

Use of the tilde ~ to indicate the complement of an event is often awkward. It is customary to indicate
the complement of an event E by $E^c$. In MATLAB, we cannot indicate the superscript, so we indicate the
complement by Ec instead of ~E. To facilitate writing combinations, we have a family of minvec procedures
(minvec3, minvec4, ..., minvec10) to expedite expressing Boolean combinations of n = 3, 4, 5, ..., 10 sets.
These generate and name the minterm vector for each generating set and its complement.

Example 2.10: Boolean combinations using minvec3 

We wish to generate a matrix whose rows are the minterm vectors for $\Omega = A \cup A^c$, $A$, $AB$, $ABC$,
$C$, and $A^c C^c$, respectively.

>> minvec3                     % Call for the setup procedure
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired
>> V = [A|Ac; A; A&B; A&B&C; C; Ac&Cc];   % Logical combinations (one per
                                          % row) yield logical vectors
>> disp(V)
     1     1     1     1     1     1     1     1    % Mixed logical and
     0     0     0     0     1     1     1     1    % numerical vectors
     0     0     0     0     0     0     1     1
     0     0     0     0     0     0     0     1
     0     1     0     1     0     1     0     1
     1     0     1     0     0     0     0     0
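What a setup procedure such as minvec3 provides can be imitated in a few lines of base MATLAB. This is a sketch under the assumption that the generating minterm vectors are the rows of the three-variable minterm table; it is not the text's implementation of minvec3.

```matlab
% Sketch only: build A, B, C and their complements without minvec3.
M  = logical(dec2bin(0:7, 3) - '0')';  % rows are the minterm vectors for A, B, C
A  = M(1,:);   B  = M(2,:);   C  = M(3,:);
Ac = ~A;       Bc = ~B;       Cc = ~C;
V  = [A|Ac; A; A&B; A&B&C; C; Ac&Cc];  % one Boolean combination per row
disp(double(V))                        % show the zero-one patterns
```

Each row of V has a one in exactly those minterm positions that make up the corresponding Boolean combination.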




















Minterm probabilities and Boolean combination 

If we have the probability of every minterm generated by a finite class, we can determine the probability of 
any Boolean combination of the members of the class. When we know the minterm expansion or, equivalently, 
the minterm vector, we simply pick out the probabilities corresponding to the minterms in the expansion 
and add them. In the following example, we do this "by hand" then show how to do it with MATLAB.

Example 2.11 

Consider $E = A(B \cup C^c) \cup A^c (B \cup C^c)^c$ and $F = A^c B^c \cup AC$ of the example above, and suppose
the respective minterm probabilities are

$p_0 = 0.21,\ p_1 = 0.06,\ p_2 = 0.29,\ p_3 = 0.11,\ p_4 = 0.09,\ p_5 = 0.03,\ p_6 = 0.14,\ p_7 = 0.07$ (2.25)

Use of a minterm map shows $E = M(1, 4, 6, 7)$ and $F = M(0, 1, 5, 7)$, so that

$P(E) = p_1 + p_4 + p_6 + p_7 = p(1, 4, 6, 7) = 0.36$ and $P(F) = p(0, 1, 5, 7) = 0.37$ (2.26)

This is easily handled in MATLAB. 

• Use minvec3 to set the generating minterm vectors.

• Use logical matrix operations

E = (A&(B|Cc)) | (Ac&~(B|Cc)) and F = (Ac&Bc) | (A&C) (2.27)

to obtain the (logical) minterm vectors for E and F.

• If pm is the matrix of minterm probabilities, perform the algebraic dot product or scalar
product of the pm matrix and the minterm vector for the combination. This can be called
for by the MATLAB commands PE = E*pm' and PF = F*pm'.

The following is a transcript of the MATLAB operations.

>> minvec3                     % Call for the setup procedure
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
>> E = (A&(B|Cc))|(Ac&~(B|Cc));
>> F = (Ac&Bc)|(A&C);
>> pm = 0.01*[21 6 29 11 9 3 14 7];
>> PE = E*pm'                  % Picks out and adds the minterm probabilities
PE = 0.3600
>> PF = F*pm'
PF = 0.3700
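As a cross-check on the dot product, the same probabilities can be obtained by logical indexing, which makes the "pick out and add" step explicit. This is a minimal sketch that assumes the minterm vectors are built directly with dec2bin rather than with minvec3; the name idxE is illustrative.

```matlab
% Sketch only: minterm vectors built with dec2bin instead of minvec3.
M  = logical(dec2bin(0:7, 3) - '0')';
A  = M(1,:);  B = M(2,:);  C = M(3,:);
Ac = ~A;  Bc = ~B;  Cc = ~C;
pm = 0.01*[21 6 29 11 9 3 14 7];       % minterm probabilities (2.25)
E  = (A&(B|Cc)) | (Ac&~(B|Cc));
F  = (Ac&Bc) | (A&C);
PE = sum(pm(E));                       % adds p1 + p4 + p6 + p7
PF = sum(pm(F));                       % adds p0 + p1 + p5 + p7
idxE = find(E) - 1;                    % minterm numbers in the expansion of E
```

The vector idxE lists the minterm numbers found on the minterm map, so it can be compared directly with the expansion M(1, 4, 6, 7).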

Example 2.12: Solution of the software survey problem 

We set up the matrix equations with the use of MATLAB and solve for the minterm probabilities.
From these, we may solve for the desired "target" probabilities.

>> minvec3
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
Data vector combinations are:
>> DV = [A|Ac; A; B; C; A&B&C; Ac&Bc; (A&B)|(A&C)|(B&C); (A&Bc&C) - 2*(Ac&B&C)]
DV =
     1     1     1     1     1     1     1     1    % Data mixed numerical
     0     0     0     0     1     1     1     1    % and logical vectors
     0     0     1     1     0     0     1     1
     0     1     0     1     0     1     0     1
     0     0     0     0     0     0     0     1
     1     1     0     0     0     0     0     0
     0     0     0     1     0     1     1     1
     0     0     0    -2     0     1     0     0
>> DP = [1 0.8 0.65 0.3 0.1 0.05 0.65 0];   % Corresponding data probabilities
>> pm = DV\DP'                              % Solution for minterm probabilities
pm =
   -0.0000                                  % Roundoff -3.5 x 10^(-17)
    0.0500
    0.1000
    0.0500
    0.2000
    0.1000
    0.4000
    0.1000
>> TV = [(A&B&Cc)|(A&Bc&C)|(Ac&B&C); Ac&Bc&C]   % Target combinations
TV =
     0     0     0     1     0     1     1     0    % Target vectors
     0     1     0     0     0     0     0     0
>> PV = TV*pm                               % Solution for target probabilities
PV =
    0.5500                                  % Target probabilities
    0.0500
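The matrix equation behind this example is DV*pm = DP', with one row of DV per data item. A self-contained sketch of the solve step follows, assuming the same survey data and building the minterm vectors with dec2bin in place of minvec3; the flag ok is an illustrative addition.

```matlab
% Sketch only: solve DV*pm = DP' for the software survey data.
M  = logical(dec2bin(0:7, 3) - '0')';
A  = M(1,:);  B = M(2,:);  C = M(3,:);
Ac = ~A;  Bc = ~B;  Cc = ~C;
DV = double([A|Ac; A; B; C; A&B&C; Ac&Bc; (A&B)|(A&C)|(B&C)]);
DV = [DV; (A&Bc&C) - 2*(Ac&B&C)];      % numerical row: P(ABcC) - 2P(AcBC) = 0
DP = [1 0.8 0.65 0.3 0.1 0.05 0.65 0];
pm = DV\DP';                           % minterm probabilities (column vector)
ok = all(pm > -1e-12);                 % no negative values beyond roundoff
TV = double([(A&B&Cc)|(A&Bc&C)|(Ac&B&C); Ac&Bc&C]);
PV = TV*pm;                            % target probabilities
```

The tolerance in the consistency flag reflects the roundoff noted in the transcript, where the first minterm probability appears as a tiny negative number.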

Example 2.13: An alternate approach 

The previous procedure first obtained all minterm probabilities, then used these to determine 
probabilities for the target combinations. The following procedure does not require calculation of the 
minterm probabilities. Sometimes the data are not sufficient to calculate all minterm probabilities, 
yet are sufficient to allow determination of the target probabilities. 

Suppose the data minterm vectors are linearly independent, and the target minterm vectors 
are linearly dependent upon the data vectors (i.e., the target vectors can be expressed as linear 
combinations of the data vectors). Now each target probability is the same linear combination of 
the data probabilities. To determine the linear combinations, solve the matrix equation 

TV = CT*DV which has the MATLAB solution CT = TV/DV (2.28)

Then the matrix tp of target probabilities is given by tp = CT*DP'. Continuing the MATLAB
procedure above, we have:

>> CT = TV/DV;
>> tp = CT*DP'
tp = 0.5500
     0.0500
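The reason the same coefficients apply to the probabilities can be written out in one line. Here pm denotes the column vector of minterm probabilities, as in the earlier examples; each data probability is the product of its minterm vector with pm, and likewise for each target probability.

```latex
\begin{aligned}
DP' &= DV \cdot pm, \qquad tp = TV \cdot pm, \\
TV &= CT \cdot DV \;\Longrightarrow\; tp = CT \cdot DV \cdot pm = CT \cdot DP'.
\end{aligned}
```

The vector pm cancels out of the final expression, which is why the target probabilities can be found even when pm itself is not fully determined by the data.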



2.2.2 The procedure mincalc 

The procedure mincalc performs calculations as in the preceding examples. The refinements consist of
determining consistency and computability of various individual minterm probabilities and target probabilities.
The consistency check is principally for negative minterm probabilities. The computability tests are tests
for linear independence by means of calculation of ranks of various matrices. The procedure picks out the
computable minterm probabilities and the computable target probabilities and calculates them.
To utilize the procedure, the problem must be formulated appropriately and precisely, as follows: 

1. Use the MATLAB program minvecq to set minterm vectors for each of q basic events.

2. Data consist of Boolean combinations of the basic events and the respective probabilities of these
combinations. These are organized into two matrices:

• The data vector matrix DV has the data Boolean combinations, one on each row. MATLAB
translates each row into the minterm vector for the corresponding Boolean combination. The
first entry (on the first row) is A|Ac (for $A \cup A^c$), which is the whole space. Its minterm vector
consists of a row of ones.

• The data probability matrix DP is a row matrix of the data probabilities. The first entry is one,
the probability of the whole space.

3. The objective is to determine the probability of various target Boolean combinations. These are put
into the target vector matrix TV, one on each row. MATLAB produces the minterm vector for each
corresponding target Boolean combination.

Computational note. In mincalc, it is necessary to turn the arrays DV and TV consisting of zero-one
patterns into zero-one matrices. This is accomplished for DV by the operation DV = ones(size(DV)).*DV,
and similarly for TV. Both the original and the transformed matrices have the same zero-one pattern, but
MATLAB interprets them differently.
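The effect of this conversion can be seen by inspecting the class of the array before and after. The following is a small sketch with an illustrative two-row array; the particular rows and the names c1, c2, and r are assumptions made only for the demonstration.

```matlab
% Sketch only: a logical array and its zero-one numeric counterpart.
A  = logical([0 0 0 0 1 1 1 1]);
B  = logical([0 0 1 1 0 0 1 1]);
DV = [A|~A; A&B];                 % built from logical operations only
c1 = class(DV);                   % 'logical'
DV = ones(size(DV)).*DV;          % same zero-one pattern, now numeric
c2 = class(DV);                   % 'double'
r  = rank(DV);                    % numeric routines such as rank can be applied
```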

Usual case

Suppose the data minterm vectors are linearly independent and the target vectors are each linearly dependent
on the data minterm vectors. Then each target minterm vector is expressible as a linear combination
of data minterm vectors. Thus, there is a matrix CT such that TV = CT*DV. MATLAB solves this
with the command CT = TV/DV. The target probabilities are the same linear combinations of the data
probabilities. These are obtained by the MATLAB operation tp = DP*CT'.
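The two hypotheses of the usual case can be checked with rank computations. The following sketch is an assumption about how such a test might look, not a listing of mincalc. It uses illustrative data for three events (probabilities 0.5, 0.3, 0.3, 0.1 for A, AB, AC, ABCc) and two targets, only the first of which depends linearly on the data vectors.

```matlab
% Sketch only (not the text's mincalc): rank tests for the usual case.
M  = logical(dec2bin(0:7, 3) - '0')';
A  = M(1,:);  B = M(2,:);  C = M(3,:);  Cc = ~C;
DV = double([A|~A; A; A&B; A&C; A&B&Cc]);      % data minterm vectors
DP = [1 0.5 0.3 0.3 0.1];                      % data probabilities
TV = double([A&~(B&Cc); (A&B)|(A&C)|(B&C)]);   % two target minterm vectors
indep = rank(DV) == size(DV, 1);               % data vectors linearly independent?
comp  = false(size(TV, 1), 1);
for i = 1:size(TV, 1)
    % a target is computable if appending it does not raise the rank
    comp(i) = rank([DV; TV(i,:)]) == rank(DV);
end
CT = TV(comp,:)/DV;                            % coefficients for computable targets
tp = DP*CT';                                   % their probabilities
```

Here the first target equals the data vector for A minus the one for ABCc, so its probability is 0.5 - 0.1 = 0.4, while the second target involves a minterm that the data cannot separate.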

Cautionary notes

The program mincalc depends upon the provision in MATLAB for solving equations when less than full 
data are available (based on the singular value decomposition). There are several situations which should 
be dealt with as special cases. It is usually a good idea to check results by hand to determine whether they 
are consistent with data. The checking by hand is usually much easier than obtaining the solution unaided, 
so that use of MATLAB is advantageous even in questionable cases. 

1. The Zero Problem. If the total probability of a group of minterms is zero, then it follows that the 
probability of each minterm in the group is zero. However, if mincalc does not have enough information 
to calculate the separate minterm probabilities in the case they are not zero, it will not pick up in the 
zero case the fact that the separate minterm probabilities are zero. It simply considers these minterm 
probabilities not computable. 

2. Linear dependence. In the case of linear dependence, the operation called for by the command CT = 
TV/DV may not be able to solve the equations. The matrix may be singular, or it may not be able to 
decide which of the redundant data equations to use. Should it provide a solution, the result should 
be checked with the aid of a minterm map. 

3. Consistency check. Since the consistency check is for negative minterm probabilities, if there are not
enough data to calculate the minterm probabilities, there is no simple check on the consistency. Sometimes the
probability of a target vector included in another vector will actually exceed what should be the larger
probability. Without considerable checking, it may be difficult to determine consistency.

4. In a few unusual cases, the command CT = TV/DV does not operate appropriately, even though the 
data should be adequate for the problem at hand. Apparently the approximation process does not 
converge. 
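The consistency check mentioned in note 3 amounts to looking for negative entries among the computed minterm probabilities. A minimal sketch with two events and deliberately inconsistent, hypothetical data (P(A) = 0.3 but P(AB) = 0.4) shows how a negative value appears; the tolerance and variable names are assumptions.

```matlab
% Sketch only: hypothetical, deliberately inconsistent data for two events.
% Minterm order: AcBc, AcB, ABc, AB
A  = [0 0 1 1];   B = [0 1 0 1];
DV = [ones(1,4); A; B; A.*B];      % whole space, A, B, AB
DP = [1 0.3 0.5 0.4];              % P(AB) exceeds P(A): not a valid assignment
pm = DV\DP';                       % formal solution for the minterm "probabilities"
consistent = all(pm >= -1e-12);    % tolerance allows for roundoff near zero
```

The formal solution assigns -0.1 to the minterm ABc, which flags the data as inconsistent.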

MATLAB Solutions for examples using mincalc 
Example 2.14: Software survey 

% file mcalc01     Data for software survey
minvec3;
DV = [A|Ac; A; B; C; A&B&C; Ac&Bc; (A&B)|(A&C)|(B&C); (A&Bc&C) - 2*(Ac&B&C)];
DP = [1 0.8 0.65 0.3 0.1 0.05 0.65 0];
TV = [(A&B&Cc)|(A&Bc&C)|(Ac&B&C); Ac&Bc&C];
disp('Call for mincalc')

>> mcalc01                 % Call for data
Call for mincalc           % Prompt supplied in the data file
>> mincalc
Data vectors are linearly independent
 Computable target probabilities
    1.0000    0.5500
    2.0000    0.0500
The number of minterms is 8
The number of available minterms is 8
Available minterm probabilities are in vector pma
To view available minterm probabilities, call for PMA
>> disp(PMA)               % Optional call for minterm probabilities
         0         0
    1.0000    0.0500
    2.0000    0.1000
    3.0000    0.0500
    4.0000    0.2000
    5.0000    0.1000
    6.0000    0.4000
    7.0000    0.1000




Example 2.15: Computer survey 



% file mcalc02.m     Data for computer survey
minvec3
DV = [A|Ac; A; B; C; A&B&C; A&C; (A&B)|(A&C)|(B&C); ...
      2*(B&C) - (A&C)];
DP = 0.001*[1000 565 515 151 51 124 212 0];
TV = [A|B|C; Ac&Bc&C];
disp('Call for mincalc')

>> mcalc02
Call for mincalc
>> mincalc
Data vectors are linearly independent
 Computable target probabilities
    1.0000    0.9680
    2.0000    0.0160
The number of minterms is 8
The number of available minterms is 8
Available minterm probabilities are in vector pma
To view available minterm probabilities, call for PMA
>> disp(PMA)
         0    0.0320
    1.0000    0.0160
    2.0000    0.3760
    3.0000    0.0110
    4.0000    0.3640
    5.0000    0.0730
    6.0000    0.0770
    7.0000    0.0510




Example 2.16

% file mcalc03.m     Data for opinion survey
minvec4
DV = [A|Ac; A; B; C; D; A&(B|Cc)&Dc; A|((B&C)|Dc); Ac&B&Cc&D; ...
      A&B&C&D; A&Bc&C; Ac&Bc&Cc&D; Ac&B&C; Ac&Bc&Dc; A&Cc; A&C&Dc; A&B&Cc&Dc];
DP = 0.001*[1000 200 500 300 700 55 520 200 15 30 195 120 120 ...
            140 25 20];
TV = [Ac&((B&Cc)|(Bc&C)); A|(B&Cc)];
disp('Call for mincalc')

>> mcalc03
Call for mincalc
>> mincalc
Data vectors are linearly independent
 Computable target probabilities
    1.0000    0.4000
    2.0000    0.4800
The number of minterms is 16
The number of available minterms is 16
Available minterm probabilities are in vector pma
To view available minterm probabilities, call for PMA
>> disp(minmap(pma))       % Display arranged as on minterm map
    0.0850    0.0800    0.0200    0.0200
    0.1950    0.2000    0.0500    0.0500
    0.0350    0.0350    0.0100    0.0150
    0.0850    0.0850    0.0200    0.0150

The procedure mincalct 

A useful modification, which we call mincalct, computes the available target probabilities, without check- 
ing and computing the minterm probabilities. This procedure assumes a data file similar to that for mincalc, 
except that it does not need the target matrix TV, since it prompts for target Boolean combination inputs. 
The procedure mincalct may be used after mincalc has performed its operations to calculate probabilities 
for additional target combinations. 

Example 2.17: (continued) Additional target datum for the opinion survey 

Suppose mincalc has been applied to the data for the opinion survey and that it is desired to
determine $P(AD \cup BD^c)$. It is not necessary to recalculate all the other quantities. We may
simply use the procedure mincalct and input the desired Boolean combination at the prompt.

>> mincalct
Enter matrix of target Boolean combinations  (A&D)|(B&Dc)
 Computable target probabilities
    1.0000    0.2850

Repeated calls for mincalct may be used to compute other target probabilities.



2.3 Problems on Minterm Analysis 3 

Exercise 2.1 (Solution on p. 48.) 

Consider the class {A, B, C, D} of events. Suppose the probability that at least one of the events A 
or C occurs is 0.75 and the probability that at least one of the four events occurs is 0.90. Determine 
the probability that neither of the events A or C but at least one of the events B or D occurs. 



3 This content is available online at <http://cnx.org/content/m24171/1.4/>.

Exercise 2.2 (Solution on p. 48.)

1. Use minterm maps to show which of the following statements are true for any class {A, B, C}:

a. $A \cup (BC)^c = A \cup B \cup B^c C^c$

b. $(A \cup B)^c = A^c C \cup B^c C$

c. $A \subset AB \cup AC \cup BC$

2. Repeat part (1) using indicator functions (evaluated on minterms). 

3. Repeat part (1) using the m-procedure minvec3 and MATLAB logical operations. 

Exercise 2.3 (Solution on p. 48.) 

Use (1) minterm maps, (2) indicator functions (evaluated on minterms), (3) the m-procedure 
minvec3 and MATLAB logical operations to show that 

a. $A(B \cup C^c) \cup A^c BC \subset A(BC \cup C^c) \cup A^c B$

b. $A \cup A^c BC = AB \cup BC \cup AC \cup AB^c C^c$

Exercise 2.4 (Solution on p. 48.) 

Minterm probabilities for the events {A, B, C, D}, arranged as on a minterm map, are

0.0168 0.0072 0.0252 0.0108 

0.0392 0.0168 0.0588 0.0252 

0.0672 0.0288 0.1008 0.0432 

0.1568 0.0672 0.2352 0.1008 

What is the probability that three or more of the events occur on a trial? Of exactly two? Of two 
or fewer? 

Exercise 2.5 (Solution on p. 49.) 

Minterm probabilities for the events {A, B, C, D, E}, arranged as on a minterm map, are

0.0216 0.0324 0.0216 0.0324 0.0144 0.0216 0.0144 0.0216 
0.0144 0.0216 0.0144 0.0216 0.0096 0.0144 0.0096 0.0144 
0.0504 0.0756 0.0504 0.0756 0.0336 0.0504 0.0336 0.0504 
0.0336 0.0504 0.0336 0.0504 0.0224 0.0336 0.0224 0.0336 

What is the probability that three or more of the events occur on a trial? Of exactly four? Of three 
or fewer? Of either two or four? 

Exercise 2.6 (Solution on p. 49.) 

Suppose $P(A \cup B^c C) = 0.65$, $P(AC) = 0.2$, $P(A^c B) = 0.25$,
$P(A^c C^c) = 0.25$, $P(BC^c) = 0.30$. Determine $P((AC^c \cup A^c C) B^c)$.
Then determine $P((AB^c \cup A^c) C^c)$ and $P(A^c (B \cup C^c))$, if possible.

Exercise 2.7 (Solution on p. 49.) 

Suppose $P((AB^c \cup A^c B) C) = 0.4$, $P(AB) = 0.2$, $P(A^c C^c) = 0.3$, $P(A) = 0.6$, $P(C) = 0.5$,
and $P(AB^c C^c) = 0.1$. Determine $P(A^c C^c \cup AC)$, $P((AB^c \cup A^c) C^c)$, and $P(A^c (B \cup C^c))$, if
possible.

Exercise 2.8 (Solution on p. 50.) 

Suppose $P(A) = 0.6$, $P(C) = 0.4$, $P(AC) = 0.3$, $P(A^c B) = 0.2$,
and $P(A^c B^c C^c) = 0.1$.
Determine $P((A \cup B) C^c)$, $P(AC^c \cup A^c C)$, and $P(AC^c \cup A^c B)$, if possible.

Exercise 2.9 (Solution on p. 50.) 

Suppose $P(A) = 0.5$, $P(AB) = P(AC) = 0.3$, and $P(ABC^c) = 0.1$.
Determine $P(A(BC^c)^c)$ and $P(AB \cup AC \cup BC)$.
Then repeat with additional data $P(A^c B^c C^c) = 0.1$ and $P(A^c BC) = 0.05$.

Exercise 2.10 (Solution on p. 51.) 

Given: $P(A) = 0.6$, $P(A^c B^c) = 0.2$, $P(AC^c) = 0.4$, and $P(ACD^c) = 0.1$.
Determine $P(A^c B \cup A(C^c \cup D))$.

Exercise 2.11 (Solution on p. 52.) 

A survey of a representative group of students yields the following information:

• 52 percent are male 

• 85 percent live on campus 

• 78 percent are male or are active in intramural sports (or both) 

• 30 percent live on campus but are not active in sports 

• 32 percent are male, live on campus, and are active in sports 

• 8 percent are male and live off campus 

• 17 percent are male students inactive in sports 

a. What is the probability that a randomly chosen student is male and lives on campus? 

b. What is the probability of a male, on campus student who is not active in sports? 

c. What is the probability of a female student active in sports? 

Exercise 2.12 (Solution on p. 52.) 

A survey of 100 persons of voting age reveals that 60 are male, 30 of whom do not identify with 
a political party; 50 are members of a political party; 20 nonmembers of a party voted in the 
last election, 10 of whom are female. How many nonmembers of a political party did not vote? 
Suggestion: Express the numbers as fractions, and treat them as probabilities.

Exercise 2.13 (Solution on p. 52.) 

During a period of unsettled weather, let A be the event of rain in Austin, B be the event of rain 
in Houston, and C be the event of rain in San Antonio. Suppose: 

$P(AB) = 0.35$, $P(AB^c) = 0.15$, $P(AC) = 0.20$, $P(AB^c \cup A^c B) = 0.45$ (2.29)

$P(BC) = 0.30$, $P(B^c C) = 0.05$, $P(A^c B^c C^c) = 0.15$ (2.30)



a. What is the probability of rain in all three cities? 

b. What is the probability of rain in exactly two of the three cities? 

c. What is the probability of rain in exactly one of the cities? 

Exercise 2.14 (Solution on p. 53.) 

One hundred students are questioned about their course of study and plans for graduate study. 
Let A = the event the student is male; B = the event the student is studying engineering; C = the 
event the student plans at least one year of foreign language; D = the event the student is planning 
graduate study (including professional school) . The results of the survey are: 

There are 55 men students; 23 engineering students, 10 of whom are women; 75 students will take 
foreign language classes, including all of the women; 26 men and 19 women plan graduate study; 13 
male engineering students and 8 women engineering students plan graduate study; 20 engineering 
students will take a foreign language and plan graduate study; 5 non engineering students plan 
graduate study but no foreign language courses; 11 non engineering, women students plan foreign 
language study and graduate study. 

a. What is the probability of selecting a student who plans foreign language classes and graduate
study?

b. What is the probability of selecting a woman engineer who does not plan graduate study?

c. What is the probability of selecting a male student who either studies a foreign language but 
does not intend graduate study or will not study a foreign language but plans graduate study? 

Exercise 2.15 (Solution on p. 54.) 

A survey of 100 students shows that: 60 are men students; 55 students live on campus, 25 of 
whom are women; 40 read the student newspaper regularly, 25 of whom are women; 70 consider 
themselves reasonably active in student affairs — 50 of these live on campus; 35 of the reasonably 
active students read the newspaper regularly; All women who live on campus and 5 who live off 
campus consider themselves to be active; 10 of the on-campus women readers consider themselves 
active, as do 5 of the off campus women; 5 men are active, off-campus, non readers of the newspaper. 

a. How many active men are either not readers or off campus? 

b. How many inactive men are not regular readers? 

Exercise 2.16 (Solution on p. 54.) 

A television station runs a telephone survey to determine how many persons in its primary viewing 
area have watched three recent special programs, which we call a, b, and c. Of the 1000 persons 
surveyed, the results are: 

221 have seen at least a; 209 have seen at least b; 112 have seen at least c; 197 have seen at 
least two of the programs; 45 have seen all three; 62 have seen at least a and c; the number having 
seen at least a and b is twice as large as the number who have seen at least b and c. 

• (a) How many have seen at least one special? 

• (b) How many have seen only one special program? 

Exercise 2.17 (Solution on p. 54.) 

An automobile safety inspection station found that in 1000 cars tested: 



100 needed wheel alignment, brake repair, and headlight adjustment
325 needed at least two of these three items
125 needed headlight and brake work
550 needed wheel alignment



a. How many needed only wheel alignment? 

b. How many who do not need wheel alignment need one or none of the other items? 

Exercise 2.18 (Solution on p. 55.) 

Suppose $P(A(B \cup C)) = 0.3$, $P(A^c) = 0.6$, and $P(A^c B^c C^c) = 0.1$.

Determine $P(B \cup C)$, $P((AB \cup A^c B^c) C^c \cup AC)$, and $P(A^c (B \cup C^c))$, if possible.
Repeat the problem with the additional data $P(A^c BC) = 0.2$ and $P(A^c B) = 0.3$.

Exercise 2.19 (Solution on p. 56.) 

A computer store sells computers, monitors, printers. A customer enters the store. Let A, B, C 
be the respective events the customer buys a computer, a monitor, a printer. Assume the following 
probabilities: 

• The probability $P(AB)$ of buying both a computer and a monitor is 0.49.

• The probability $P(ABC^c)$ of buying both a computer and a monitor but not a printer is
0.17.

• The probability $P(AC)$ of buying both a computer and a printer is 0.45.

• The probability $P(BC)$ of buying both a monitor and a printer is 0.39.

• The probability $P(AC^c \bigvee A^c C)$ of buying a computer or a printer, but not both is 0.50.

• The probability $P(AB^c \bigvee A^c B)$ of buying a computer or a monitor, but not both is 0.43.

• The probability $P(BC^c \bigvee B^c C)$ of buying a monitor or a printer, but not both is 0.43.



a. What is the probability P (A), P (B), or P (C) of buying each? 

b. What is the probability of buying exactly two of the three items? 

c. What is the probability of buying at least two? 

d. What is the probability of buying all three? 

Exercise 2.20 (Solution on p. 56.) 

Data are $P(A) = 0.232$, $P(B) = 0.228$, $P(ABC) = 0.045$, $P(AC) = 0.062$,
$P(AB \cup AC \cup BC) = 0.197$ and $P(BC) = 2P(AC)$.
Determine $P(A \cup B \cup C)$ and $P(A^c B^c C)$, if possible.
Repeat, with the additional data $P(C) = 0.230$.

Exercise 2.21 (Solution on p. 57.) 

Data are: $P(A) = 0.4$, $P(AB) = 0.3$, $P(ABC) = 0.25$, $P(C) = 0.65$,
$P(A^c C^c) = 0.3$. Determine available minterm probabilities and the following,
if computable:

$P(AC^c \cup A^c C)$, $P(A^c B^c)$, $P(A \cup B)$, $P(AB^c)$ (2.31)

With only six items of data (including $P(\Omega) = P(A \cup A^c) = 1$), not all minterms are available.
Try the additional data $P(A^c BC^c) = 0.1$ and $P(A^c B^c) = 0.3$. Are these consistent and linearly
independent? Are all minterm probabilities available?

Exercise 2.22 (Solution on p. 58.) 

Repeat Exercise 2.21 with P (AB) changed from 0.3 to 0.5. What is the result? Explain the reason 
for this result. 

Exercise 2.23 (Solution on p. 59.) 

Repeat Exercise 2.21 with the original data probability matrix, but with AB replaced by AC in
the data vector matrix. What is the result? Does mincalc work in this case? Check results on a
minterm map.

Solutions to Exercises in Chapter 2

Solution to Exercise 2.1 (p. 43)

Use the pattern $P(E \cup F) = P(E) + P(E^c F)$ and $(A \cup C)^c = A^c C^c$.

$P(A \cup C \cup B \cup D) = P(A \cup C) + P(A^c C^c (B \cup D))$, so that $P(A^c C^c (B \cup D)) = 0.90 - 0.75 = 0.15$ (2.32)

Solution to Exercise 2.2 (p. 44) 

We use the MATLAB procedure, which displays the essential patterns.

minvec3
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
E = A|~(B&C);
F = A|B|(Bc&Cc);
disp([E;F])
     1     1     1     0     1     1     1     1    % Rows differ: (a) does not hold
     1     0     1     1     1     1     1     1
G = ~(A|B);
H = (Ac&C)|(Bc&C);
disp([G;H])
     1     1     0     0     0     0     0     0    % Rows differ: (b) does not hold
     0     1     0     1     0     1     0     0
K = (A&B)|(A&C)|(B&C)
K =  0     0     0     1     0     1     1     1    % A = 0 0 0 0 1 1 1 1 is not contained in K

Solution to Exercise 2.3 (p. 44)

We use the MATLAB procedure, which displays the essential patterns.

minvec3
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
E = (A&(B|Cc))|(Ac&B&C);
F = (A&((B&C)|Cc))|(Ac&B);
disp([E;F])
     0     0     0     1     1     0     1     1    % E subset of F
     0     0     1     1     1     0     1     1
G = A|(Ac&B&C);
H = (A&B)|(B&C)|(A&C)|(A&Bc&Cc);
disp([G;H])
     0     0     0     1     1     1     1     1    % G = H
     0     0     0     1     1     1     1     1

Solution to Exercise 2.4 (p. 44) 

We use mintable(4) and determine positions with correct number(s) of ones (number of occurrences). An 
alternate is to use minvec4 and express the Boolean combinations which give the correct number(s) of ones. 

npr02_04 (Section 17.8.1: npr02_04)
Minterm probabilities are in pm. Use mintable(4)
a = mintable(4);
s = sum(a);                % Number of ones in each minterm position
P1 = (s>=3)*pm'            % Select and add minterm probabilities
P1 = 0.4716
P2 = (s==2)*pm'
P2 = 0.3728
P3 = (s<=2)*pm'
P3 = 0.5284

Solution to Exercise 2.5 (p. 44) 

We use mintable(5) and determine positions with correct number(s) of ones (number of occurrences). 

npr02_05 (Section 17.8.2: npr02_05)
Minterm probabilities are in pm. Use mintable(5)
a = mintable(5);
s = sum(a);                % Number of ones in each minterm position
P1 = (s>=3)*pm'            % Select and add minterm probabilities
P1 = 0.5380
P2 = (s==4)*pm'
P2 = 0.1712
P3 = (s<=3)*pm'
P3 = 0.7952
P4 = ((s==2)|(s==4))*pm'
P4 = 0.4784

Solution to Exercise 2.6 (p. 44) 

% file npr02_06.m (Section 17.8.3: npr02_06)     % Data file
% Data for Exercise 2.6
minvec3
DV = [A|Ac; A|(Bc&C); A&C; Ac&B; Ac&Cc; B&Cc];
DP = [1 0.65 0.20 0.25 0.25 0.30];
TV = [((A&Cc)|(Ac&C))&Bc; ((A&Bc)|Ac)&Cc; Ac&(B|Cc)];
disp('Call for mincalc')

npr02_06                   % Call for data
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
Call for mincalc
mincalc
Data vectors are linearly independent
 Computable target probabilities
    1.0000    0.3000       % The first and third target probabilities
    3.0000    0.3500       % are calculated. Check with minterm map.
The number of minterms is 8
The number of available minterms is 4
Available minterm probabilities are in vector pma
To view available minterm probabilities, call for PMA

Solution to Exercise 2.7 (p. 44) 

% file npr02_07.m (Section 17.8.4: npr02_07)
% Data for Exercise 2.7
minvec3
DV = [A|Ac; ((A&Bc)|(Ac&B))&C; A&B; Ac&Cc; A; C; A&Bc&Cc];
DP = [1 0.4 0.2 0.3 0.6 0.5 0.1];
TV = [(Ac&Cc)|(A&C); ((A&Bc)|Ac)&Cc; Ac&(B|Cc)];
disp('Call for mincalc')

npr02_07                   % Call for data
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
Call for mincalc
mincalc
Data vectors are linearly independent
 Computable target probabilities
    1.0000    0.7000       % All target probabilities calculable
    2.0000    0.4000       % even though not all minterms are available
    3.0000    0.4000
The number of minterms is 8
The number of available minterms is 6
Available minterm probabilities are in vector pma
To view available minterm probabilities, call for PMA

Solution to Exercise 2.8 (p. 44) 

% file npr02_08.m (Section 17.8.5: npr02_08)
% Data for Exercise 2.8
minvec3
DV = [A|Ac; A; C; A&C; Ac&B; Ac&Bc&Cc];
DP = [1 0.6 0.4 0.3 0.2 0.1];
TV = [(A|B)&Cc; (A&Cc)|(Ac&C); (A&Cc)|(Ac&B)];
disp('Call for mincalc')

npr02_08                   % Call for data
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
Call for mincalc
mincalc
Data vectors are linearly independent
 Computable target probabilities
    1.0000    0.5000       % All target probabilities calculable
    2.0000    0.4000       % even though not all minterms are available
    3.0000    0.5000
The number of minterms is 8
The number of available minterms is 4
Available minterm probabilities are in vector pma
To view available minterm probabilities, call for PMA



Solution to Exercise 2.9 (p. 44) 

% file npr02_09.m (Section 17.8.6: npr02_09)
% Data for Exercise 2.9
minvec3
DV = [A|Ac; A; A&B; A&C; A&B&Cc];
DP = [1 0.5 0.3 0.3 0.1];
TV = [A&(~(B&Cc)); (A&B)|(A&C)|(B&C)];
disp('Call for mincalc')
% Modification for part 2
% DV = [DV; Ac&Bc&Cc; Ac&B&C];
% DP = [DP 0.1 0.05];

npr02_09                   % Call for data
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
Call for mincalc
mincalc
Data vectors are linearly independent
 Computable target probabilities
    1.0000    0.4000       % Only the first target probability calculable
The number of minterms is 8
The number of available minterms is 4
Available minterm probabilities are in vector pma
To view available minterm probabilities, call for PMA
DV = [DV; Ac&Bc&Cc; Ac&B&C];     % Modification of data
DP = [DP 0.1 0.05];
mincalc
Data vectors are linearly independent
 Computable target probabilities
    1.0000    0.4000       % Both target probabilities calculable
    2.0000    0.4500       % even though not all minterms are available
The number of minterms is 8
The number of available minterms is 6
Available minterm probabilities are in vector pma
To view available minterm probabilities, call for PMA

Solution to Exercise 2.10 (p. 45) 

% file npr02_10.m (Section 17.8.7: npr02_10)
% Data for Exercise 2.10
minvec4
DV = [A|Ac; A; Ac&Bc; A&Cc; A&C&Dc];
DP = [1 0.6 0.2 0.4 0.1];
TV = [(Ac&B)|(A&(Cc|D))];
disp('Call for mincalc')

npr02_10
Variables are A, B, C, D, Ac, Bc, Cc, Dc
They may be renamed, if desired.
Call for mincalc
mincalc
Data vectors are linearly independent
 Computable target probabilities
    1.0000    0.7000       % Checks with minterm map solution
The number of minterms is 16
The number of available minterms is 0
Available minterm probabilities are in vector pma
To view available minterm probabilities, call for PMA

Solution to Exercise 2.11 (p. 45)

% file npr02_11.m (Section 17.8.8: npr02_11)
% Data for Exercise 2.11
% A = male; B = on campus; C = active in sports
minvec3
DV = [A|Ac; A; B; A|C; B&Cc; A&B&C; A&Bc; A&Cc];
DP = [1 0.52 0.85 0.78 0.30 0.32 0.08 0.17];
TV = [A&B; A&B&Cc; Ac&C];
disp('Call for mincalc')

npr02_11
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
Call for mincalc
mincalc
Data vectors are linearly independent
 Computable target probabilities
    1.0000    0.4400
    2.0000    0.1200
    3.0000    0.2600
The number of minterms is 8
The number of available minterms is 8
Available minterm probabilities are in vector pma
To view available minterm probabilities, call for PMA

Solution to Exercise 2.12 (p. 45) 

'/. file npr02_12.m (Section~17 .8 . 9 : npr02_12) 
'/. Data for Exercise~2 . 12 

'/, A = male; B = party member; C = voted last election 
minvec3 

DV = [A I Ac; A; A&Bc; B; Bc&C; Ac&Bc&C] ; 
DP = [ 1 0.60 0.30 0.50 0.20 0.10]; 
TV = [Bc&Cc] ; 
disp('Call for mincalc') 
npr02_12 

Variables are A, B, C, Ac, Be, Cc 
They may be renamed, if desired. 
Call for mincalc 
mincalc 

Data vectors are linearly independent 
Computable target probabilities 
1.0000 0.3000 
The number of minterms is 8 
The number of available minterms is 4 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 
Solution to Exercise 2.13 (p. 45)

% file npr02_13.m (Section 17.8.10: npr02_13)
% Data for Exercise 2.13

% A = rain in Austin; B = rain in Houston;
% C = rain in San Antonio
minvec3

DV = [A|Ac; A&B; A&Bc; A&C; (A&Bc)|(Ac&B); B&C; Bc&C; Ac&Bc&Cc];
DP = [ 1 0.35 0.15 0.20 0.45 0.30 0.05 0.15];
TV = [A&B&C; (A&B&Cc)|(A&Bc&C)|(Ac&B&C); (A&Bc&Cc)|(Ac&B&Cc)|(Ac&Bc&C)];
disp('Call for mincalc')
npr02_13

Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired. 
Call for mincalc 
mincalc 

Data vectors are linearly independent 
Computable target probabilities 

1.0000 0.2000 

2.0000 0.2500 

3.0000 0.4000 
The number of minterms is 8 
The number of available minterms is 8 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 

Solution to Exercise 2.14 (p. 45) 

% file npr02_14.m (Section 17.8.11: npr02_14)
% Data for Exercise 2.14
% A = male; B = engineering;
% C = foreign language; D = graduate study
minvec4
DV = [A|Ac; A; B; Ac&B; C; Ac&C; A&D; Ac&D; A&B&D; ...
      Ac&B&D; B&C&D; Bc&Cc&D; Ac&Bc&C&D];
DP = [1 0.55 0.23 0.10 0.75 0.45 0.26 0.19 0.13 0.08 0.20 0.05 0.11];
TV = [C&D; Ac&Dc; A&((C&Dc)|(Cc&D))];
disp('Call for mincalc')
npr02_14

Variables are A, B, C, D, Ac, Bc, Cc, Dc
They may be renamed, if desired. 
Call for mincalc 
mincalc 

Data vectors are linearly independent 
Computable target probabilities 

1.0000 0.3900 

2.0000 0.2600 % Third target probability not calculable
The number of minterms is 16 
The number of available minterms is 4 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 
Solution to Exercise 2.15 (p. 46)

% file npr02_15.m (Section 17.8.12: npr02_15)
% Data for Exercise 2.15

% A = men; B = on campus; C = readers; D = active
minvec4
DV = [A|Ac; A; B; Ac&B; C; Ac&C; D; B&D; C&D; ...
      Ac&B&D; Ac&Bc&D; Ac&B&C&D; Ac&Bc&C&D; A&Bc&Cc&D];
DP = [1 0.6 0.55 0.25 0.40 0.25 0.70 0.50 0.35 0.25 0.05 0.10 0.05 0.05];
TV = [A&D&(Cc|Bc); A&Dc&Cc];
disp('Call for mincalc')
npr02_15

Variables are A, B, C, D, Ac, Bc, Cc, Dc
They may be renamed, if desired. 
Call for mincalc 
mincalc 

Data vectors are linearly independent 
Computable target probabilities 

1.0000 0.3000 

2.0000 0.2500 
The number of minterms is 16 
The number of available minterms is 8 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 

Solution to Exercise 2.16 (p. 46) 

% file npr02_16.m (Section 17.8.13: npr02_16)
% Data for Exercise 2.16
minvec3

DV = [A|Ac; A; B; C; (A&B)|(A&C)|(B&C); A&B&C; A&C; (A&B) - 2*(B&C)];
DP = [ 1 0.221 0.209 0.112 0.197 0.045 0.062 0];

TV = [A|B|C; (A&Bc&Cc)|(Ac&B&Cc)|(Ac&Bc&C)];
npr02_16

Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired. 
Call for mincalc 
mincalc 

Data vectors are linearly independent 
Computable target probabilities 

1.0000 0.3000 

2.0000 0.1030 
The number of minterms is 8 
The number of available minterms is 8 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 

Solution to Exercise 2.17 (p. 46) 



% file npr02_17.m (Section 17.8.14: npr02_17)
% Data for Exercise 2.17
% A = alignment; B = brake work; C = headlight
minvec3

DV = [A|Ac; A&B&C; (A&B)|(A&C)|(B&C); B&C; A];
DP = [ 1 0.100 0.325 0.125 0.550];
TV = [A&Bc&Cc; Ac&(~(B&C))];
disp('Call for mincalc')
npr02_17

Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired. 
Call for mincalc 
mincalc 

Data vectors are linearly independent 
Computable target probabilities 

1.0000 0.2500 

2.0000 0.4250 
The number of minterms is 8 
The number of available minterms is 3 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 

Solution to Exercise 2.18 (p. 46) 

% file npr02_18.m (Section 17.8.15: npr02_18)
% Data for Exercise 2.18
minvec3

DV = [A|Ac; A&(B|C); Ac; Ac&Bc&Cc];
DP = [ 1 0.3 0.6 0.1];

TV = [B|C; (((A&B)|(Ac&Bc))&Cc)|(A&C); Ac&(B|Cc)];
disp('Call for mincalc')
% Modification
% DV = [DV; Ac&B&C; Ac&B];
% DP = [DP 0.2 0.3];
npr02_18

Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired. 
Call for mincalc 
mincalc 

Data vectors are linearly independent 
Computable target probabilities 

1.0000 0.8000 

2.0000 0.4000 
The number of minterms is 8 
The number of available minterms is 2 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 
DV = [DV; Ac&B&C; Ac&B]; % Modified data
DP = [DP 0.2 0.3];

mincalc % New calculation

Data vectors are linearly independent 
Computable target probabilities 

1.0000 0.8000 

2.0000 0.4000 

3.0000 0.4000
The number of minterms is 8 
The number of available minterms is 5 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 

Solution to Exercise 2.19 (p. 46) 

% file npr02_19.m (Section 17.8.16: npr02_19)
% Data for Exercise 2.19

% A = computer; B = monitor; C = printer
minvec3
DV = [A|Ac; A&B; A&B&Cc; A&C; B&C; (A&Cc)|(Ac&C); ...
      (A&Bc)|(Ac&B); (B&Cc)|(Bc&C)];
DP = [1 0.49 0.17 0.45 0.39 0.50 0.43 0.43];

TV = [A; B; C; (A&B&Cc)|(A&Bc&C)|(Ac&B&C); (A&B)|(A&C)|(B&C); A&B&C];
disp('Call for mincalc')
npr02_19

Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired. 
Call for mincalc 
mincalc 

Data vectors are linearly independent 
Computable target probabilities 

1.0000 0.8000 

2.0000 0.6100 

3.0000 0.6000 

4.0000 0.3700 

5.0000 0.6900 

6.0000 0.3200 
The number of minterms is 8 
The number of available minterms is 8 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 

Solution to Exercise 2.20 (p. 47) 

% file npr02_20.m (Section 17.8.17: npr02_20)
% Data for Exercise 2.20
minvec3

DV = [A|Ac; A; B; A&B&C; A&C; (A&B)|(A&C)|(B&C); B&C - 2*(A&C)];
DP = [ 1 0.232 0.228 0.045 0.062 0.197 0];

TV = [A|B|C; Ac&Bc&Cc];
disp('Call for mincalc')
% Modification
% DV = [DV; C];
% DP = [DP 0.230];

npr02_20
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
mincalc

Data vectors are linearly independent 

Data probabilities are INCONSISTENT 

The number of minterms is 8 

The number of available minterms is 6 

Available minterm probabilities are in vector pma 

To view available minterm probabilities, call for PMA 

disp(PMA) 

2.0000 0.0480 

3.0000 -0.0450 % Negative minterm probabilities indicate

4.0000 -0.0100 % inconsistency of data

5.0000 0.0170 

6.0000 0.1800 

7.0000 0.0450 
DV = [DV ; C] ; 
DP = [DP 0.230]; 
mincalc 

Data vectors are linearly independent 
Data probabilities are INCONSISTENT 
The number of minterms is 8 
The number of available minterms is 8 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 

Solution to Exercise 2.21 (p. 47) 

% file npr02_21.m (Section 17.8.18: npr02_21)
% Data for Exercise 2.21
minvec3

DV = [A|Ac; A; A&B; A&B&C; C; Ac&Cc];
DP = [ 1 0.4 0.3 0.25 0.65 0.3];
TV = [(A&Cc)|(Ac&C); Ac&Bc; A|B; A&Bc];
disp('Call for mincalc')
% Modification
% DV = [DV; Ac&B&Cc; Ac&Bc];
% DP = [DP 0.1 0.3];

npr02_21
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired. 
Call for mincalc 
mincalc 

Data vectors are linearly independent 
Computable target probabilities 

1.0000 0.3500 

4.0000 0.1000 
The number of minterms is 8 
The number of available minterms is 4 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 
DV = [DV; Ac&B&Cc; Ac&Bc];
DP = [DP 0.1 0.3];
mincalc 

Data vectors are linearly independent 
Computable target probabilities 

1.0000 0.3500 

2.0000 0.3000 

3.0000 0.7000 

4.0000 0.1000 
The number of minterms is 8 
The number of available minterms is 8 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 

Solution to Exercise 2.22 (p. 47) 

% file npr02_22.m (Section 17.8.19: npr02_22)
% Data for Exercise 2.22
minvec3

DV = [A|Ac; A; A&B; A&B&C; C; Ac&Cc];
DP = [ 1 0.4 0.5 0.25 0.65 0.3];
TV = [(A&Cc)|(Ac&C); Ac&Bc; A|B; A&Bc];
disp('Call for mincalc')
% Modification
% DV = [DV; Ac&B&Cc; Ac&Bc];
% DP = [DP 0.1 0.3];

npr02_22
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired. 
Call for mincalc 
mincalc 

Data vectors are linearly independent 
Data probabilities are INCONSISTENT 
The number of minterms is 8 
The number of available minterms is 4 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 
disp(PMA) 

4.0000 -0.2000 

5.0000 0.1000 

6.0000 0.2500 

7.0000 0.2500 
DV = [DV; Ac&B&Cc; Ac&Bc]; 
DP = [DP 0.1 0.3]; 
mincalc 

Data vectors are linearly independent 
Data probabilities are INCONSISTENT 
The number of minterms is 8 
The number of available minterms is 8 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 
disp(PMA)

         0    0.2000
    1.0000    0.1000
    2.0000    0.1000
    3.0000    0.2000
    4.0000   -0.2000
    5.0000    0.1000
    6.0000    0.2500
    7.0000    0.2500


Solution to Exercise 2.23 (p. 47) 

% file npr02_23.m (Section 17.8.20: npr02_23)
% Data for Exercise 2.23
minvec3

DV = [A|Ac; A; A&C; A&B&C; C; Ac&Cc];
DP = [ 1 0.4 0.3 0.25 0.65 0.3];
TV = [(A&Cc)|(Ac&C); Ac&Bc; A|B; A&Bc];
disp('Call for mincalc')
% Modification
% DV = [DV; Ac&B&Cc; Ac&Bc];
% DP = [DP 0.1 0.3];
npr02_23

Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired. 
Call for mincalc 
mincalc 

Data vectors are NOT linearly independent 
Warning: Rank deficient, rank = 5 tol = 5.0243e-15 

Computable target probabilities 
1.0000 0.4500 
The number of minterms is 8 
The number of available minterms is 2 
Available minterm probabilities are in vector pma 
To view available minterm probabilities, call for PMA 
DV = [DV; Ac&B&Cc; Ac&Bc]; 
DP = [DP 0.1 0.3]; 
mincalc 

Data vectors are NOT linearly independent 
Warning: Matrix is singular to working precision. 

Computable target probabilities 

1 Inf % Note that p(4) and p(7) are given in data

2 Inf 

3 Inf 

The number of minterms is 8 

The number of available minterms is 6 

Available minterm probabilities are in vector pma 

To view available minterm probabilities, call for PMA 

Chapter 3

Conditional Probability 



3.1 Conditional Probability 1 

3.1.1 Introduction 

The probability P (A) of an event A is a measure of the likelihood that the event will occur on any trial. 
Sometimes partial information determines that an event C has occurred. Given this information, it may 
be necessary to reassign the likelihood for each event A. This leads to the notion of conditional probability. 
For a fixed conditioning event C, this assignment to all events constitutes a new probability measure which 
has all the properties of the original probability measure. In addition, because of the way it is derived from 
the original, the conditional probability measure has a number of special properties which are important in 
applications. 

3.1.2 Conditional probability 

The original or prior probability measure utilizes all available information to make probability assignments
P(A), P(B), etc., subject to the defining conditions (P1), (P2), and (P3) (p. 11). The probability P(A)
indicates the likelihood that event A will occur on any trial.

Frequently, new information is received which leads to a reassessment of the likelihood of event A. For 
example 

• An applicant for a job as a manager of a service department is being interviewed. His resume shows 
adequate experience and other qualifications. He conducts himself with ease and is quite articulate in 
his interview. He is considered a prospect highly likely to succeed. The interview is followed by an 
extensive background check. His credit rating, because of bad debts, is found to be quite low. With 
this information, the likelihood that he is a satisfactory candidate changes radically. 

• A young woman is seeking to purchase a used car. She finds one that appears to be an excellent buy. 
It looks "clean," has reasonable mileage, and is a dependable model of a well known make. Before 
buying, she has a mechanic friend look at it. He finds evidence that the car has been wrecked with 
possible frame damage that has been repaired. The likelihood the car will be satisfactory is thus 
reduced considerably. 

• A physician is conducting a routine physical examination on a patient in her seventies. She is somewhat 
overweight. He suspects that she may be prone to heart problems. Then he discovers that she exercises 
regularly, eats a low fat, high fiber, variegated diet, and comes from a family in which survival well
into their nineties is common. On the basis of this new information, he reassesses the likelihood of 
heart problems. 



1 This content is available online at <http://cnx.org/content/m23252/1.6/>.

New, but partial, information determines a conditioning event C, which may call for reassessing the likelihood
of event A. For one thing, this means that A occurs iff the event AC occurs. Effectively, this makes C a new
basic space. The new unit of probability mass is P(C). How should the new probability assignments be
made? One possibility is to make the new assignment to A proportional to the probability P(AC). These
considerations and experience with the classical case suggest the following procedure for reassignment.
Although such a reassignment is not logically necessary, subsequent developments give substantial evidence
that this is the appropriate procedure.

Definition. If C is an event having positive probability, the conditional probability of A, given C, is

$$ P(A|C) = \frac{P(AC)}{P(C)} \qquad (3.1) $$

For a fixed conditioning event C, we have a new likelihood assignment to the event A. Now

$$ P(A|C) \ge 0, \quad P(\Omega|C) = 1, \quad \text{and} \quad P\left(\bigvee_j A_j \,\Big|\, C\right) = \frac{P\left(\bigvee_j A_j C\right)}{P(C)} = \sum_j \frac{P(A_j C)}{P(C)} = \sum_j P(A_j|C) \qquad (3.2) $$

Thus, the new function $P(\,\cdot\,|C)$ satisfies the three defining properties (P1), (P2), and (P3) (p. 11) for
probability, so that for fixed C, we have a new probability measure, with all the properties of an ordinary
probability measure.

Remark. When we write P(A|C) we are evaluating the likelihood of event A when it is known that event
C has occurred. This is not the probability of a conditional event A|C. Conditional events have no meaning
in the model we are developing.

Example 3.1: Conditional probabilities from joint frequency data 

A survey of student opinion on a proposed national health care program included 250 students, of 
whom 150 were undergraduates and 100 were graduate students. Their responses were categorized 
Y (affirmative), N (negative), and D (uncertain or no opinion). Results are tabulated below. 





         Y     N     D
    U   60    40    50
    G   70    20    10



Table 3.1 

Suppose the sample is representative, so the results can be taken as typical of the student body. 
A student is picked at random. Let Y be the event he or she is favorable to the plan, N be the 
event he or she is unfavorable, and D be the event of no opinion (or uncertain). Let U be the event
the student is an undergraduate and G be the event he or she is a graduate student. The data may
reasonably be interpreted

$$ P(G) = 100/250, \quad P(U) = 150/250, \quad P(Y) = (60 + 70)/250, \quad P(YU) = 60/250, \quad \text{etc.} \qquad (3.3) $$



Then 



$$ P(Y|U) = \frac{P(YU)}{P(U)} = \frac{60/250}{150/250} = \frac{60}{150} \qquad (3.4) $$

Similarly, we can calculate

$$ P(N|U) = 40/150, \quad P(D|U) = 50/150, \quad P(Y|G) = 70/100, \quad P(N|G) = 20/100, \quad P(D|G) = 10/100 \qquad (3.5) $$

We may also calculate directly

$$ P(U|Y) = 60/130, \quad P(G|N) = 20/60, \quad \text{etc.} \qquad (3.6) $$
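
The conditional probabilities of Example 3.1 can be checked numerically from the frequency table. The following is a minimal sketch in base MATLAB (no m-procedures are needed); the variable names counts, P_joint, P_given_row, and P_given_col are our own choices for illustration and do not appear in the text.

```matlab
% Joint frequency counts from Table 3.1 (rows: U, G; columns: Y, N, D)
counts = [60 40 50;
          70 20 10];
total = sum(counts(:));                 % 250 students

P_joint = counts / total;               % P(YU), P(NU), ..., P(DG)
P_row = sum(P_joint, 2);                % P(U); P(G)
P_col = sum(P_joint, 1);                % P(Y), P(N), P(D)

% Conditional probability = joint probability divided by the
% probability of the conditioning event.
P_given_row = P_joint ./ repmat(P_row, 1, 3);   % rows: P(.|U), P(.|G)
P_given_col = P_joint ./ repmat(P_col, 2, 1);   % columns: P(.|Y), P(.|N), P(.|D)

disp(P_given_row)
disp(P_given_col)
```

Each row of P_given_row sums to one, which illustrates that for a fixed conditioning event the conditional probabilities form a probability measure.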

Conditional probability often provides a natural way to deal with compound trials carried out in several 
steps. 

Example 3.2: Jet aircraft with two engines 

An aircraft has two jet engines. It will fly with only one engine operating. Let $F_1$ be the event
one engine fails on a long distance flight, and $F_2$ the event the second fails. Experience indicates
that $P(F_1) = 0.0003$. Once the first engine fails, added load is placed on the second, so that
$P(F_2|F_1) = 0.001$. Now the second engine can fail only if the other has already failed. Thus
$F_2 \subset F_1$, so that

$$ P(F_2) = P(F_1 F_2) = P(F_1) P(F_2|F_1) = 3 \times 10^{-7} \qquad (3.7) $$

Thus reliability of any one engine may be less than satisfactory, yet the overall reliability may be
quite high.

The following example is taken from the UMAP Module 576, by Paul Mullenix, reprinted in UMAP Journal, 
vol 2, no. 4. More extensive treatment of the problem is given there. 

Example 3.3: Responses to a sensitive question on a survey 

In a survey, if answering "yes" to a question may tend to incriminate or otherwise embarrass the 
subject, the response given may be incorrect or misleading. Nonetheless, it may be desirable to 
obtain correct responses for purposes of social analysis. The following device for dealing with this 
problem is attributed to B. G. Greenberg. By a chance process, each subject is instructed to do 
one of three things: 

1. Respond with an honest answer to the question. 

2. Respond "yes" to the question, regardless of the truth in the matter. 

3. Respond "no" regardless of the true answer. 

Let A be the event the subject is told to reply honestly, B be the event the subject is instructed
to reply "yes," and C be the event the answer is to be "no." The probabilities P(A), P(B), and
P(C) are determined by a chance mechanism (i.e., a fraction P(A) selected randomly are told to
answer honestly, etc.). Let E be the event the reply is "yes." We wish to calculate P(E|A), the
probability the answer is "yes" given the response is honest.

SOLUTION 

Since $E = EA \bigvee B$, we have

$$ P(E) = P(EA) + P(B) = P(E|A) P(A) + P(B) \qquad (3.8) $$

which may be solved algebraically to give

$$ P(E|A) = \frac{P(E) - P(B)}{P(A)} \qquad (3.9) $$

Suppose there are 250 subjects. The chance mechanism is such that P(A) = 0.7, P(B) = 0.14 and
P(C) = 0.16. There are 62 responses "yes," which we take to mean P(E) = 62/250. According to
the pattern above

$$ P(E|A) = \frac{62/250 - 14/100}{70/100} = \frac{27}{175} \approx 0.154 \qquad (3.10) $$
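
The arithmetic of Example 3.3 is easily scripted. The following is a sketch in base MATLAB, using the data given above; the variable names (n_subjects, n_yes, PE_given_A, and so on) are assumptions made for illustration.

```matlab
% Data from Example 3.3
n_subjects = 250;
n_yes = 62;
PA = 0.70;      % fraction told to answer honestly
PB = 0.14;      % fraction told to answer "yes"
PC = 0.16;      % fraction told to answer "no"

% The three instructions form a partition, so the fractions sum to one.
assert(abs(PA + PB + PC - 1) < 1e-12)

PE = n_yes / n_subjects;            % relative frequency of "yes" replies

% Solve P(E) = P(E|A)P(A) + P(B) for P(E|A)
PE_given_A = (PE - PB) / PA;

fprintf('Estimated P(E|A) = %.4f\n', PE_given_A);
```

Note that P(C) does not enter the estimate directly; it matters only through P(A) = 1 - P(B) - P(C).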

The formulation of conditional probability assumes the conditioning event C is well defined. Sometimes 
there are subtle difficulties. It may not be entirely clear from the problem description what the conditioning 
event is. This is usually due to some ambiguity or misunderstanding of the information provided. 

Example 3.4: What is the conditioning event? 

Five equally qualified candidates for a job, Jim, Paul, Richard, Barry, and Evan, are identified on 
the basis of interviews and told that they are finalists. Three of these are to be selected at random, 
with results to be posted the next day. One of them, Jim, has a friend in the personnel office. Jim 
asks the friend to tell him the name of one of those selected (other than himself). The friend tells 
Jim that Richard has been selected. Jim analyzes the problem as follows. 

ANALYSIS 

Let $A_i$, $1 \le i \le 5$, be the event the ith of these is hired ($A_1$ is the event Jim is hired, $A_3$ is the
event Richard is hired, etc.). Now $P(A_i)$ (for each i) is the probability that finalist i is in one of
the combinations of three from five. Thus, Jim's probability of being hired, before receiving the
information about Richard, is

$$ P(A_1) = \frac{1 \times C(4,2)}{C(5,3)} = \frac{6}{10} = P(A_i), \quad 1 \le i \le 5 \qquad (3.11) $$

The information that Richard is one of those hired is information that the event $A_3$ has occurred.
Also, for any pair $i \ne j$ the number of combinations of three from five including these two is just
the number of ways of picking one from the remaining three. Hence,

$$ P(A_1 A_3) = \frac{C(3,1)}{C(5,3)} = \frac{3}{10} = P(A_i A_j), \quad i \ne j \qquad (3.12) $$

The conditional probability

$$ P(A_1|A_3) = \frac{P(A_1 A_3)}{P(A_3)} = \frac{3/10}{6/10} = 1/2 \qquad (3.13) $$

This is consistent with the fact that if Jim knows that Richard is hired, then there are two to be
selected from the four remaining finalists, so that

$$ P(A_1|A_3) = \frac{1 \times C(3,1)}{C(4,2)} = \frac{3}{6} = 1/2 \qquad (3.14) $$

Discussion 

Although this solution seems straightforward, it has been challenged as being incomplete. Many feel that
there must be information about how the friend chose to name Richard. Many would make an assumption
somewhat as follows. The friend took the three names selected: if Jim was one of them, Jim's name was
removed and an equally likely choice among the other two was made; otherwise, the friend selected on an
equally likely basis one of the three to be hired. Under this assumption, the information assumed is an event
$B_3$ which is not the same as $A_3$. In fact, computation (see Example 3.7, below) shows

$$ P(A_1|B_3) = \frac{6}{10} = P(A_1) \ne P(A_1|A_3) \qquad (3.15) $$

Both results are mathematically correct. The difference is in the conditioning event, which corresponds to
the difference in the information given (or assumed).

— □ 
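
Both conditional probabilities in Example 3.4 can be checked by listing the ten equally likely combinations. The following sketch uses only base MATLAB; the labeling (1 = Jim, 3 = Richard) follows the example, while the variable names and the encoding of the friend's rule are our own assumptions, written to match the verbal description in the discussion.

```matlab
% Enumerate the C(5,3) = 10 equally likely sets of three hired finalists.
S = nchoosek(1:5, 3);               % each row is one combination
n = size(S, 1);

A1 = any(S == 1, 2);                % Jim is hired
A3 = any(S == 3, 2);                % Richard is hired
P_A1_given_A3 = sum(A1 & A3) / sum(A3);

% Assumed rule for the friend: name one of the selected finalists other
% than Jim, each with equal likelihood.
PB3 = zeros(n, 1);                  % P(friend names Richard | combination k)
for k = 1:n
    others = S(k, S(k,:) ~= 1);     % selected finalists other than Jim
    if any(others == 3)
        PB3(k) = 1 / numel(others);
    end
end
P_B3 = sum(PB3) / n;                % law of total probability
P_A1B3 = sum(PB3(A1)) / n;
P_A1_given_B3 = P_A1B3 / P_B3;

fprintf('P(A1|A3) = %.4f   P(A1|B3) = %.4f\n', P_A1_given_A3, P_A1_given_B3);
```

The two results differ only because the conditioning events differ: A3 is "Richard is hired," while B3 is "the friend names Richard."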

3.1.3 Some properties 

In addition to its properties as a probability measure, conditional probability has special properties which 
are consequences of the way it is related to the original probability measure $P(\cdot)$. The following are easily
derived from the definition of conditional probability and basic properties of the prior probability measure, 
and prove useful in a variety of problem situations. 

(CP1) Product rule If P(ABCD) > 0, then P(ABCD) = P(A) P(B|A) P(C|AB) P(D|ABC).

Derivation

The defining expression may be written in product form: P(AB) = P(A) P(B|A). Likewise

$$ P(ABC) = P(A) \cdot \frac{P(AB)}{P(A)} \cdot \frac{P(ABC)}{P(AB)} = P(A) P(B|A) P(C|AB) \qquad (3.16) $$

and

$$ P(ABCD) = P(A) \cdot \frac{P(AB)}{P(A)} \cdot \frac{P(ABC)}{P(AB)} \cdot \frac{P(ABCD)}{P(ABC)} = P(A) P(B|A) P(C|AB) P(D|ABC) \qquad (3.17) $$

This pattern may be extended to the intersection of any finite number of events. Also, the events may be 
taken in any order. 

— □ 

Example 3.5: Selection of items from a lot 

An electronics store has ten items of a given type in stock. One is defective. Four successive
customers each purchase one of the items. Each time, the selection is on an equally likely basis from
those remaining. What is the probability that all four customers get good items?

SOLUTION

Let $E_i$ be the event the ith customer receives a good item. Then the first chooses one of the
nine out of ten good ones, the second chooses one of the eight out of nine good ones, etc., so that

$$ P(E_1 E_2 E_3 E_4) = P(E_1) P(E_2|E_1) P(E_3|E_1 E_2) P(E_4|E_1 E_2 E_3) = \frac{9}{10} \cdot \frac{8}{9} \cdot \frac{7}{8} \cdot \frac{6}{7} = \frac{6}{10} \qquad (3.18) $$

Note that this result could be determined by a combinatorial argument: under the assumptions,
each combination of four of ten is equally likely; the number of combinations of four good ones is
the number of combinations of four of the nine. Hence

$$ P(E_1 E_2 E_3 E_4) = \frac{C(9,4)}{C(10,4)} = \frac{126}{210} = 3/5 \qquad (3.19) $$
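
The two calculations in Example 3.5 may be compared numerically. The following is a sketch in base MATLAB; the names good, total, k, P_product, and P_comb are our own.

```matlab
% Example 3.5: ten items, nine good, four successive selections.
good = 9;  total = 10;  k = 4;

% Product rule (CP1): factors 9/10, 8/9, 7/8, 6/7
steps = (good - (0:k-1)) ./ (total - (0:k-1));
P_product = prod(steps);

% Combinatorial argument: C(9,4)/C(10,4)
P_comb = nchoosek(good, k) / nchoosek(total, k);

fprintf('Product rule: %.4f   Combinations: %.4f\n', P_product, P_comb);
```

Changing good, total, or k gives the corresponding result for other lot sizes, provided k does not exceed good.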

Example 3.6: A selection problem 

Three items are to be selected (on an equally likely basis at each step) from ten, two of which are 
defective. Determine the probability that the first and third selected are good. 

SOLUTION 

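Let $G_i$, $1 \le i \le 3$, be the event the ith unit selected is good. Then $G_1 G_3 = G_1 G_2 G_3 \bigvee G_1 G_2^c G_3$.
By the product rule

$$ P(G_1 G_3) = P(G_1) P(G_2|G_1) P(G_3|G_1 G_2) + P(G_1) P(G_2^c|G_1) P(G_3|G_1 G_2^c) = \frac{8}{10} \cdot \frac{7}{9} \cdot \frac{6}{8} + \frac{8}{10} \cdot \frac{2}{9} \cdot \frac{7}{8} = \frac{28}{45} \approx 0.62 \qquad (3.20) $$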
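
The result of Example 3.6 can be confirmed by brute-force enumeration of the ordered selections. The sketch below assumes, for illustration only, that items 1 through 8 are good and items 9 and 10 are defective; the labeling is arbitrary and the variable names are our own.

```matlab
% Items 1..8 are good, items 9 and 10 are defective (arbitrary labels).
is_good = [true(1, 8) false(1, 2)];

n_total = 0;      % ordered selections of three distinct items
n_fav = 0;        % selections whose first and third items are good
for i = 1:10
    for j = 1:10
        for k = 1:10
            if i ~= j && i ~= k && j ~= k
                n_total = n_total + 1;
                if is_good(i) && is_good(k)
                    n_fav = n_fav + 1;
                end
            end
        end
    end
end

P_enum = n_fav / n_total;
P_rule = (8/10)*(7/9)*(6/8) + (8/10)*(2/9)*(7/8);   % product rule, two cases

fprintf('Enumeration: %.4f   Product rule: %.4f\n', P_enum, P_rule);
```

Since each ordered selection is equally likely under the stated assumptions, the ratio of counts agrees with the product-rule value.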

(CP2) Law of total probability Suppose the class $\{A_i : 1 \le i \le n\}$ of events is mutually exclusive and
every outcome in E is in one of these events. Thus, $E = A_1 E \bigvee A_2 E \bigvee \cdots \bigvee A_n E$, a disjoint union. Then

$$ P(E) = P(E|A_1) P(A_1) + P(E|A_2) P(A_2) + \cdots + P(E|A_n) P(A_n) \qquad (3.21) $$

Example 3.7: A compound experiment

Five cards are numbered one through five. A two-step selection procedure is carried out as follows.

1. Three cards are selected without replacement, on an equally likely basis.

• If card 1 is drawn, the other two are put in a box

• If card 1 is not drawn, all three are put in a box

2. One of the cards in the box is drawn on an equally likely basis (from either two or three)

Let $A_i$ be the event the ith card is drawn on the first selection and let $B_i$ be the event the card
numbered i is drawn on the second selection (from the box). Determine $P(B_5)$, $P(A_1 B_5)$, and
$P(A_1|B_5)$.

SOLUTION 

From Example 3.4 (What is the conditioning event?), we have $P(A_i) = 6/10$ and $P(A_i A_j) = 3/10$.
This implies

$$ P(A_i A_j^c) = P(A_i) - P(A_i A_j) = 3/10 \qquad (3.22) $$

Now we can draw card five on the second selection only if it is selected on the first drawing, so
that $B_5 \subset A_5$. Also $A_5 = A_1 A_5 \bigvee A_1^c A_5$. We therefore have $B_5 = B_5 A_5 = B_5 A_1 A_5 \bigvee B_5 A_1^c A_5$. By
the law of total probability (CP2) (p. 66),

$$ P(B_5) = P(B_5|A_1 A_5) P(A_1 A_5) + P(B_5|A_1^c A_5) P(A_1^c A_5) = \frac{1}{2} \cdot \frac{3}{10} + \frac{1}{3} \cdot \frac{3}{10} = \frac{1}{4} \qquad (3.23) $$

Also, since $A_1 B_5 = A_1 A_5 B_5$,

$$ P(A_1 B_5) = P(A_1 A_5 B_5) = P(A_1 A_5) P(B_5|A_1 A_5) = \frac{3}{10} \cdot \frac{1}{2} = \frac{3}{20} \qquad (3.24) $$

We thus have

$$ P(A_1|B_5) = \frac{3/20}{5/20} = \frac{6}{10} = P(A_1) \qquad (3.25) $$

Occurrence of event $B_5$ has no effect on the likelihood of the occurrence of $A_1$. This condition is
examined more thoroughly in the chapter on "Independence of Events" (Section 4.1).

Often in applications data lead to conditioning with respect to an event but the problem calls for "conditioning 
in the opposite direction." 

Example 3.8: Reversal of conditioning 

Students in a freshman mathematics class come from three different high schools. Their mathe- 
matical preparation varies. In order to group them appropriately in class sections, they are given 
a diagnostic test. Let $H_i$ be the event that a student tested is from high school i, $1 \le i \le 3$. Let F
be the event the student fails the test. Suppose data indicate

$$ P(H_1) = 0.2, \quad P(H_2) = 0.5, \quad P(H_3) = 0.3, \quad P(F|H_1) = 0.10, \quad P(F|H_2) = 0.02, \quad P(F|H_3) = 0.06 \qquad (3.26) $$

A student passes the exam. Determine for each i the conditional probability $P(H_i|F^c)$ that the
student is from high school i.

SOLUTION

$$ P(F^c) = P(F^c|H_1) P(H_1) + P(F^c|H_2) P(H_2) + P(F^c|H_3) P(H_3) = 0.90 \cdot 0.2 + 0.98 \cdot 0.5 + 0.94 \cdot 0.3 = 0.952 \qquad (3.27) $$

Then

$$ P(H_1|F^c) = \frac{P(F^c H_1)}{P(F^c)} = \frac{P(F^c|H_1) P(H_1)}{P(F^c)} = \frac{180}{952} = 0.1891 \qquad (3.28) $$

Similarly,

$$ P(H_2|F^c) = \frac{P(F^c|H_2) P(H_2)}{P(F^c)} = \frac{490}{952} = 0.5147 \quad \text{and} \quad P(H_3|F^c) = \frac{P(F^c|H_3) P(H_3)}{P(F^c)} = \frac{282}{952} = 0.2962 \qquad (3.29) $$
The basic pattern utilized in the reversal is the following. 

(CP3) Bayes' rule If $E \subset \bigvee_{i=1}^{n} A_i$ (as in the law of total probability), then

$$ P(A_i|E) = \frac{P(A_i E)}{P(E)} = \frac{P(E|A_i) P(A_i)}{P(E)}, \quad 1 \le i \le n \qquad (3.30) $$

The law of total probability yields P(E).

Such reversals are desirable in a variety of practical situations. 

Example 3.9: A compound selection and reversal 

Begin with items in two lots: 

1. Three items, one defective. 

2. Four items, one defective. 

One item is selected from lot 1 (on an equally likely basis); this item is added to lot 2; a selection 
is then made from lot 2 (also on an equally likely basis). This second item is good. What is the 
probability the item selected from lot 1 was good? 

SOLUTION 

Let $G_1$ be the event the first item (from lot 1) was good, and $G_2$ be the event the second item
(from the augmented lot 2) is good. We want to determine $P(G_1|G_2)$. Now the data are interpreted
as

$$ P(G_1) = 2/3, \quad P(G_2|G_1) = 4/5, \quad P(G_2|G_1^c) = 3/5 \qquad (3.31) $$

By the law of total probability (CP2) (p. 66),

$$ P(G_2) = P(G_1) P(G_2|G_1) + P(G_1^c) P(G_2|G_1^c) = \frac{2}{3} \cdot \frac{4}{5} + \frac{1}{3} \cdot \frac{3}{5} = \frac{11}{15} \qquad (3.32) $$

By Bayes' rule (CP3) (p. 67),

$$ P(G_1|G_2) = \frac{P(G_2|G_1) P(G_1)}{P(G_2)} = \frac{4/5 \times 2/3}{11/15} = \frac{8}{11} \approx 0.73 \qquad (3.33) $$

Example 3.10: Additional problems requiring reversals 

• Medical tests. Suppose D is the event a patient has a certain disease and T is the event a
test for the disease is positive. Data are usually of the form: prior probability P(D) (or prior
odds $P(D)/P(D^c)$), probability $P(T|D^c)$ of a false positive, and probability $P(T^c|D)$ of a
false negative. The desired probabilities are $P(D|T)$ and $P(D^c|T^c)$.

• Safety alarm. If D is the event a dangerous condition exists (say a steam pressure is too
high) and T is the event the safety alarm operates, then data are usually of the form P(D),
$P(T|D^c)$, and $P(T^c|D)$, or equivalently (e.g., $P(T^c|D^c)$ and $P(T|D)$). Again, the desired
probabilities are that the safety alarm signals correctly, $P(D|T)$ and $P(D^c|T^c)$.

• Job success. If H is the event of success on a job, and E is the event that an individual
interviewed has certain desirable characteristics, the data are usually prior P(H) and reliability of
the characteristics as predictors in the form $P(E|H)$ and $P(E|H^c)$. The desired probability
is $P(H|E)$.

• Presence of oil. If H is the event of the presence of oil at a proposed well site, and E is the
event of certain geological structure (salt dome or fault), the data are usually P(H) (or the
odds), $P(E|H)$, and $P(E|H^c)$. The desired probability is $P(H|E)$.

• Market condition. Before launching a new product on the national market, a firm usually
examines the condition of a test market as an indicator of the national market. If H is the
event the national market is favorable and E is the event the test market is favorable, data
are a prior estimate P(H) of the likelihood the national market is sound, and data $P(E|H)$
and $P(E|H^c)$ indicating the reliability of the test market. What is desired is $P(H|E)$, the
likelihood the national market is favorable, given the test market is favorable.

The calculations, as in Example 3.8 (Reversal of conditioning), are simple but can be tedious. We have an
m-procedure called bayes to perform the calculations easily. The probabilities $P(A_i)$ are put into a matrix
PA and the conditional probabilities $P(E|A_i)$ are put into matrix PEA. The desired probabilities $P(A_i|E)$
and $P(A_i|E^c)$ are calculated and displayed.

Example 3.11: MATLAB calculations for Example 3.8 (Reversal of conditioning) 

>> PEA = [0.10 0.02 0.06];
>> PA = [0.2 0.5 0.3];
>> bayes
Requires input PEA = [P(E|A1) P(E|A2) ... P(E|An)]
and PA = [P(A1) P(A2) ... P(An)]
Determines PAE = [P(A1|E) P(A2|E) ... P(An|E)]
and PAEc = [P(A1|Ec) P(A2|Ec) ... P(An|Ec)]
Enter matrix PEA of conditional probabilities PEA
Enter matrix PA of probabilities PA
P(E) = 0.048
    P(E|Ai)   P(Ai)     P(Ai|E)   P(Ai|Ec)
    0.1000    0.2000    0.4167    0.1891
    0.0200    0.5000    0.2083    0.5147
    0.0600    0.3000    0.3750    0.2962
Various quantities are in the matrices PEA, PA, PAE, PAEc, named above

The procedure displays the results in tabular form, as shown. In addition, the various quantities are 
in the workspace in the matrices named, so that they may be used in further calculations without 
recopying. 
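
For readers who do not have the m-procedure at hand, the arithmetic it carries out can be reproduced in a few lines of base MATLAB. The following is only a sketch of the calculation implied by (CP2) and (CP3), not the source of the bayes procedure itself; it reuses the matrix names PEA, PA, PAE, and PAEc from the display above, and PE is our own name for P(E).

```matlab
% Data from Example 3.8: E = "fails the test", Ai = "from high school i"
PEA = [0.10 0.02 0.06];              % P(E|Ai)
PA  = [0.2 0.5 0.3];                 % P(Ai)

PE   = PEA * PA';                    % law of total probability (CP2)
PAE  = PEA .* PA / PE;               % Bayes' rule (CP3): P(Ai|E)
PAEc = (1 - PEA) .* PA / (1 - PE);   % same rule applied to Ec: P(Ai|Ec)

fprintf('P(E) = %.3f\n', PE);
disp([PEA; PA; PAE; PAEc]')          % columns: P(E|Ai) P(Ai) P(Ai|E) P(Ai|Ec)
```

The last column reproduces the values 0.1891, 0.5147, and 0.2962 obtained by hand in Example 3.8.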

The following variation of Bayes' rule is applicable in many practical situations. 

(CP3*) Ratio form of Bayes' rule: $\dfrac{P(A|C)}{P(B|C)} = \dfrac{P(AC)}{P(BC)} = \dfrac{P(C|A)}{P(C|B)} \cdot \dfrac{P(A)}{P(B)}$

The left hand member is called the posterior odds, which is the odds after knowledge of the occurrence
of the conditioning event $C$. The second fraction in the right hand member is the prior odds, which is the odds
before knowledge of the occurrence of the conditioning event $C$. The first fraction in the right hand member
is known as the likelihood ratio. It is the ratio of the probabilities (or likelihoods) of $C$ for the two different
probability measures $P(\cdot|A)$ and $P(\cdot|B)$.

Example 3.12: A performance test

As a part of a routine maintenance procedure, a computer is given a performance test. The machine 
seems to be operating so well that the prior odds it is satisfactory are taken to be ten to one. The 
test has probability 0.05 of a false positive and 0.01 of a false negative. A test is performed. The 
result is positive. What are the posterior odds the device is operating properly? 

SOLUTION 

Let $S$ be the event the computer is operating satisfactorily and let $T$ be the event the test is
favorable. The data are $P(S)/P(S^c) = 10$, $P(T|S^c) = 0.05$, and $P(T^c|S) = 0.01$. Then by the
ratio form of Bayes' rule

$\dfrac{P(S|T)}{P(S^c|T)} = \dfrac{P(T|S)}{P(T|S^c)} \cdot \dfrac{P(S)}{P(S^c)} = \dfrac{0.99}{0.05} \cdot 10 = 198$ so that $P(S|T) = \dfrac{198}{199} = 0.9950$ (3.34)
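
The same calculation is easily carried out in MATLAB. The following lines are a minimal sketch with the data of Example 3.12; the variable names are our own and are not part of any m-procedure in this work.

```matlab
% Sketch of the ratio form of Bayes' rule with the data of Example 3.12.
prior_odds = 10;                           % P(S)/P(Sc)
PT_S  = 1 - 0.01;                          % P(T|S) = 1 - P(Tc|S)
PT_Sc = 0.05;                              % P(T|Sc), the false positive probability
likelihood_ratio = PT_S/PT_Sc;             % P(T|S)/P(T|Sc)
post_odds = likelihood_ratio*prior_odds;   % P(S|T)/P(Sc|T)
PS_T = post_odds/(1 + post_odds);          % convert odds to a probability
fprintf('posterior odds = %g, P(S|T) = %.4f\n', post_odds, PS_T)
```

The last step uses the fact that odds $r$ correspond to probability $r/(1 + r)$.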

The following property serves to establish in the chapters on "Independence of Events" (Section 4.1) and 
"Conditional Independence" (Section 5.1) a number of important properties for the concept of independence 
and of conditional independence of events. 

(CP4) Some equivalent conditions. If $0 < P(A) < 1$ and $0 < P(B) < 1$, then

$P(A|B) \ast P(A)$ iff $P(B|A) \ast P(B)$ iff $P(AB) \ast P(A)P(B)$ and (3.35)

$P(AB) \ast P(A)P(B)$ iff $P(A^cB^c) \ast P(A^c)P(B^c)$ iff $P(AB^c) \diamond P(A)P(B^c)$ (3.36)

where $\ast$ is $<$, $\le$, $=$, $\ge$, or $>$ and $\diamond$ is $>$, $\ge$, $=$, $\le$, or $<$, respectively.

Because of the role of this property in the theory of independence and conditional independence, we 
examine the derivation of these results. 

VERIFICATION of (CP4) (p. 69) 

a. $P(AB) \ast P(A)P(B)$ iff $P(A|B) \ast P(A)$ (divide by $P(B)$; may exchange $A$ and $A^c$)

b. $P(AB) \ast P(A)P(B)$ iff $P(B|A) \ast P(B)$ (divide by $P(A)$; may exchange $B$ and $B^c$)

c. $P(AB) \ast P(A)P(B)$ iff $[P(A) - P(AB^c)] \ast P(A)[1 - P(B^c)]$ iff $-P(AB^c) \ast -P(A)P(B^c)$ iff
$P(AB^c) \diamond P(A)P(B^c)$

d. We may use c to get $P(AB) \ast P(A)P(B)$ iff $P(AB^c) \diamond P(A)P(B^c)$ iff $P(A^cB^c) \ast P(A^c)P(B^c)$

□

A number of important and useful propositions may be derived from these.

1. $P(A|B) + P(A^c|B) = 1$, but, in general, $P(A|B) + P(A|B^c) \ne 1$.

2. $P(A|B) > P(A)$ iff $P(A|B^c) < P(A)$.

3. $P(A^c|B) > P(A^c)$ iff $P(A|B) < P(A)$.

4. $P(A|B) > P(A)$ iff $P(A^c|B^c) > P(A^c)$.

VERIFICATION: Exercises (see problem set)
□

3.1.4 Repeated conditioning 

Suppose conditioning by the event C has occurred. Additional information is then received that event D 
has occurred. We have a new conditioning event CD. There are two possibilities: 

1. Reassign the conditional probabilities. $P_C(A)$ becomes

$P_C(A|D) = \dfrac{P_C(AD)}{P_C(D)} = \dfrac{P(ACD)}{P(CD)}$ (3.37)

2. Reassign the total probabilities: $P(A)$ becomes

$P_{CD}(A) = P(A|CD) = \dfrac{P(ACD)}{P(CD)}$ (3.38)

Basic result: $P_C(A|D) = P(A|CD) = P_D(A|C)$. Thus repeated conditioning by two events may be done
in any order, or may be done in one step. This result extends easily to repeated conditioning by any finite
number of events. This result is important in extending the concept of "Independence of Events" (Section 4.1)
to "Conditional Independence" (Section 5.1). These conditions are important for many problems of probable
inference.
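
A small numerical check may help fix the idea. The sketch below uses a hypothetical sample space of twelve equally likely outcomes and three hypothetical events (none of these come from the text) and conditions in both orders and in one step.

```matlab
% Sketch: repeated conditioning in either order, on a hypothetical example.
w  = 1:12;                      % hypothetical outcomes
Pw = ones(1,12)/12;             % equally likely
A = mod(w,2) == 0;              % hypothetical event: outcome is even
C = w <= 8;                     % hypothetical event: outcome at most 8
D = mod(w,3) ~= 0;              % hypothetical event: not a multiple of 3
P = @(E) sum(Pw(E));            % probability of an event given as a logical vector
PcA_D = (P(A&C&D)/P(C)) / (P(C&D)/P(C));   % condition by C first, then by D
PdA_C = (P(A&C&D)/P(D)) / (P(C&D)/P(D));   % condition by D first, then by C
PA_CD = P(A&C&D)/P(C&D);                   % condition by CD in one step
disp([PcA_D PdA_C PA_CD])
```

All three displayed values agree, as the basic result requires.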

3.2 Problems on Conditional Probability 2 

Exercise 3.1 (Solution on p. 74.) 

Given the following data: 

$P(A) = 0.55$, $P(AB) = 0.30$, $P(BC) = 0.20$, $P(A^c \cup BC) = 0.55$, $P(A^cBC^c) = 0.15$ (3.39)

Determine, if possible, the conditional probability $P(A^c|B) = P(A^cB)/P(B)$.

Exercise 3.2 (Solution on p. 74.) 

In Exercise 11 (Exercise 2.11) from "Problems on Minterm Analysis," we have the following data: 
A survey of a representative group of students yields the following information:

• 52 percent are male 

• 85 percent live on campus 

• 78 percent are male or are active in intramural sports (or both) 

• 30 percent live on campus but are not active in sports 

• 32 percent are male, live on campus, and are active in sports 

• 8 percent are male and live off campus 

• 17 percent are male students inactive in sports 

Let A = male, B = on campus, C = active in sports. 

• (a) A student is selected at random. He is male and lives on campus. What is the (conditional) 
probability that he is active in sports? 

• (b) A student selected is active in sports. What is the (conditional) probability that she is a
female who lives on campus?

Exercise 3.3 (Solution on p. 74.) 

In a certain population, the probability a woman lives to at least seventy years is 0.70 and is 0.55 
that she will live to at least eighty years. If a woman is seventy years old, what is the conditional 
probability she will survive to eighty years? Note that if $A \subset B$ then $P(AB) = P(A)$.

Exercise 3.4 (Solution on p. 74.) 

From 100 cards numbered 00, 01, 02, ..., 99, one card is drawn. Suppose $A_i$ is the event the sum
of the two digits on a card is $i$, $0 \le i \le 18$, and $B_j$ is the event the product of the two digits is $j$.
Determine $P(A_i|B_0)$ for each possible $i$.

Exercise 3.5 (Solution on p. 74.) 

Two fair dice are rolled. 

a. What is the (conditional) probability that one turns up two spots, given they show different 
numbers? 



2 This content is available online at <http://cnx.Org/content/m24173/l.4/>. 
b. What is the (conditional) probability that the first turns up six, given that the sum is k, for
each k from two through 12? 

c. What is the (conditional) probability that at least one turns up six, given that the sum is k, 
for each k from two through 12? 

Exercise 3.6 (Solution on p. 75.) 

Four persons are to be selected from a group of 12 people, 7 of whom are women. 

a. What is the probability that the first and third selected are women? 

b. What is the probability that three of those selected are women? 

c. What is the (conditional) probability that the first and third selected are women, given that 
three of those selected are women? 

Exercise 3.7 (Solution on p. 75.) 

Twenty percent of the paintings in a gallery are not originals. A collector buys a painting. He has 
probability 0.10 of buying a fake for an original but never rejects an original as a fake. What is the
(conditional) probability the painting he purchases is an original? 

Exercise 3.8 (Solution on p. 75.) 

Five percent of the units of a certain type of equipment brought in for service have a common 
defect. Experience shows that 93 percent of the units with this defect exhibit a certain behavioral 
characteristic, while only two percent of the units which do not have this defect exhibit that 
characteristic. A unit is examined and found to have the characteristic symptom. What is the 
conditional probability that the unit has the defect, given this behavior? 

Exercise 3.9 (Solution on p. 75.) 

A shipment of 1000 electronic units is received. There is an equally likely probability that there 
are 0, 1, 2, or 3 defective units in the lot. If one is selected at random and found to be good, what 
is the probability of no defective units in the lot? 

Exercise 3.10 (Solution on p. 75.) 

Data on incomes and salary ranges for a certain population are analyzed as follows. $S_1$ = event
annual income is less than \$25,000; $S_2$ = event annual income is between \$25,000 and \$100,000; $S_3$ =
event annual income is greater than \$100,000. $E_1$ = event did not complete college education; $E_2$ =
event of completion of bachelor's degree; $E_3$ = event of completion of graduate or professional degree
program. Data may be tabulated as follows: $P(E_1) = 0.65$, $P(E_2) = 0.30$, and $P(E_3) = 0.05$.

$P(S_i|E_j)$ (3.40)

              $S_1$     $S_2$     $S_3$
$E_1$         0.85      0.10      0.05
$E_2$         0.10      0.80      0.10
$E_3$         0.05      0.50      0.45
$P(S_i)$      0.50      0.40      0.10

Table 3.2



a. Determine $P(E_3S_3)$.

b. Suppose a person has a university education (no graduate study). What is the (conditional)
probability that he or she will make \$25,000 or more?

c. Find the total probability that a person's income category is at least as high as his or her
educational level.

Exercise 3.11 (Solution on p. 75.) 

In a survey, 85 percent of the employees say they favor a certain company policy. Previous 
experience indicates that 20 percent of those who do not favor the policy say that they do, out of 
fear of reprisal. What is the probability that an employee picked at random really does favor the 
company policy? It is reasonable to assume that all who favor say so. 

Exercise 3.12 (Solution on p. 75.) 

A quality control group is designing an automatic test procedure for compact disk players coming 
from a production line. Experience shows that one percent of the units produced are defective. The 
automatic test procedure has probability 0.05 of giving a false positive indication and probability 
0.02 of giving a false negative. That is, if D is the event a unit tested is defective, and T is the event 
that it tests satisfactory, then $P(T|D) = 0.05$ and $P(T^c|D^c) = 0.02$. Determine the probability
$P(D^c|T)$ that a unit which tests good is, in fact, free of defects.

Exercise 3.13 (Solution on p. 76.) 

Five boxes of random access memory chips have 100 units per box. They have respectively one, 
two, three, four, and five defective units. A box is selected at random, on an equally likely basis, 
and a unit is selected at random therefrom. It is defective. What are the (conditional) probabilities 
the unit was selected from each of the boxes? 

Exercise 3.14 (Solution on p. 76.) 

Two percent of the units received at a warehouse are defective. A nondestructive test procedure 
gives two percent false positive indications and five percent false negative. Units which fail to pass 
the inspection are sold to a salvage firm. This firm applies a corrective procedure which does not 
affect any good unit and which corrects 90 percent of the defective units. A customer buys a unit 
from the salvage firm. It is good. What is the (conditional) probability the unit was originally 
defective? 

Exercise 3.15 (Solution on p. 76.) 

At a certain stage in a trial, the judge feels the odds are two to one the defendant is guilty. It
is determined that the defendant is left handed. An investigator convinces the judge this is six
times more likely if the defendant is guilty than if he were not. What is the likelihood, given this
evidence, that the defendant is guilty?

Exercise 3.16 (Solution on p. 76.) 

Show that if $P(A|C) > P(B|C)$ and $P(A|C^c) > P(B|C^c)$, then $P(A) > P(B)$. Is the converse
true? Prove or give a counterexample.

Exercise 3.17 (Solution on p. 76.) 

Since $P(\cdot|B)$ is a probability measure for a given $B$, we must have $P(A|B) + P(A^c|B) = 1$.
Construct an example to show that in general $P(A|B) + P(A|B^c) \ne 1$.

Exercise 3.18 (Solution on p. 76.) 

Use property (CP4) (p. 69) to show

a. $P(A|B) > P(A)$ iff $P(A|B^c) < P(A)$

b. $P(A^c|B) > P(A^c)$ iff $P(A|B) < P(A)$

c. $P(A|B) > P(A)$ iff $P(A^c|B^c) > P(A^c)$

Exercise 3.19 (Solution on p. 76.) 

Show that $P(A|B) \ge (P(A) + P(B) - 1)/P(B)$.

Exercise 3.20 (Solution on p. 76.) 

Show that $P(A|B) = P(A|BC)P(C|B) + P(A|BC^c)P(C^c|B)$.

Exercise 3.21 (Solution on p. 77.)

An individual is to select from among $n$ alternatives in an attempt to obtain a particular one.
This might be selection from answers on a multiple choice question, when only one is correct. Let
$A$ be the event he makes a correct selection, and $B$ be the event he knows which is correct before
making the selection. We suppose $P(B) = p$ and $P(A|B^c) = 1/n$. Determine $P(B|A)$; show that
$P(B|A) \ge P(B)$ and $P(B|A)$ increases with $n$ for fixed $p$.

Exercise 3.22 (Solution on p. 77.) 

Polya's urn scheme for a contagious disease. An urn contains initially $b$ black balls and $r$ red balls
($r + b = n$). A ball is drawn on an equally likely basis from among those in the urn, then replaced
along with $c$ additional balls of the same color. The process is repeated. There are $n$ balls on the
first choice, $n + c$ balls on the second choice, etc. Let $B_k$ be the event of a black ball on the $k$th
draw and $R_k$ be the event of a red ball on the $k$th draw. Determine

a. $P(B_2|R_1)$

b. $P(B_1B_2)$

c. $P(R_2)$

d. $P(B_1|R_2)$
Solutions to Exercises in Chapter 3

Solution to Exercise 3.1 (p. 70)

% file npr03_01.m (Section 17.8.21: npr03_01)
% Data for Exercise 3.1
minvec3
DV = [A|Ac; A; A&B; B&C; Ac|(B&C); Ac&B&Cc];
DP = [ 1 0.55 0.30 0.20 0.55 0.15 ];
TV = [Ac&B; B];
disp('Call for mincalc')

npr03_01
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
Call for mincalc
mincalc
Data vectors are linearly independent
Computable target probabilities
    1.0000    0.2500
    2.0000    0.5500
The number of minterms is 8
The number of available minterms is 4

P = 0.25/0.55
P = 0.4545

Solution to Exercise 3.2 (p. 70) 



npr02_11 (Section 17.8.8: npr02_11)
mincalc
mincalct
Enter matrix of target Boolean combinations [A&B&C; A&B; Ac&B&C; C]
Computable target probabilities
    1.0000    0.3200
    2.0000    0.4400
    3.0000    0.2300
    4.0000    0.6100

PC_AB = 0.32/0.44
PC_AB = 0.7273
PAcB_C = 0.23/0.61
PAcB_C = 0.3770

Solution to Exercise 3.3 (p. 70) 

Let $A$ = event she lives to seventy and $B$ = event she lives to eighty. Since $B \subset A$,
$P(B|A) = P(AB)/P(A) = P(B)/P(A) = 55/70$.

Solution to Exercise 3.4 (p. 70) 

$B_0$ is the event one of the first ten is drawn. $A_iB_0$ is the event that the card with numbers $0i$ is drawn.
$P(A_i|B_0) = (1/100)/(1/10) = 1/10$ for each $i$, 0 through 9.

Solution to Exercise 3.5 (p. 70)

a. There are $6 \times 5$ ways to choose all different. There are $2 \times 5$ ways that they are different and one turns
up two spots. The conditional probability is 2/6.

b. Let $A_6$ = event first is a six and $S_k$ = event the sum is $k$. Now $A_6S_k = \emptyset$ for $k \le 6$. A table of
sums shows $P(A_6S_k) = 1/36$ and $P(S_k) = 6/36, 5/36, 4/36, 3/36, 2/36, 1/36$ for $k = 7$ through 12,
respectively. Hence $P(A_6|S_k) = 1/6, 1/5, 1/4, 1/3, 1/2, 1$, respectively.

c. If $AB_6$ is the event at least one is a six, then $P(AB_6S_k) = 2/36$ for $k = 7$ through 11 and $P(AB_6S_{12}) = 1/36$.
Thus, the conditional probabilities are 2/6, 2/5, 2/4, 2/3, 1, 1, respectively.
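
Parts (b) and (c) may be checked by direct enumeration of the 36 equally likely outcomes. The following is a sketch only; the variable names are our own, and the die recorded in d1 is taken to be "the first."

```matlab
% Sketch: enumerate two fair dice and count outcomes for parts (b) and (c).
[d1, d2] = meshgrid(1:6, 1:6);      % all 36 equally likely outcomes
s = d1 + d2;                        % sum on each outcome
k = 7:12;
PA6_Sk  = zeros(size(k));           % P(first is six | sum is k)
PAB6_Sk = zeros(size(k));           % P(at least one six | sum is k)
for j = 1:numel(k)
    Sk = (s == k(j));               % outcomes with sum k(j)
    PA6_Sk(j)  = sum(sum(Sk & (d1 == 6)))/sum(sum(Sk));
    PAB6_Sk(j) = sum(sum(Sk & (d1 == 6 | d2 == 6)))/sum(sum(Sk));
end
disp([k; PA6_Sk; PAB6_Sk])
```

Because the outcomes are equally likely, each conditional probability reduces to a ratio of counts.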

Solution to Exercise 3.6 (p. 71) 

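$P(W_1W_3) = P(W_1W_2W_3) + P(W_1W_2^cW_3) = \dfrac{7}{12} \cdot \dfrac{6}{11} \cdot \dfrac{5}{10} + \dfrac{7}{12} \cdot \dfrac{5}{11} \cdot \dfrac{6}{10} = \dfrac{7}{22}$ (3.41)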

Solution to Exercise 3.7 (p. 71) 

Let $B$ = the event the collector buys, and $G$ = the event the painting is original. Assume $P(B|G) = 1$ and
$P(B|G^c) = 0.1$. If $P(G) = 0.8$, then

$P(G|B) = \dfrac{P(GB)}{P(B)} = \dfrac{P(B|G)P(G)}{P(B|G)P(G) + P(B|G^c)P(G^c)} = \dfrac{0.8}{0.8 + 0.1 \cdot 0.2} = \dfrac{40}{41}$ (3.42)

Solution to Exercise 3.8 (p. 71) 

Let $D$ = the event the unit is defective and $C$ = the event it has the characteristic. Then $P(D) = 0.05$,
$P(C|D) = 0.93$, and $P(C|D^c) = 0.02$.

$P(D|C) = \dfrac{P(C|D)P(D)}{P(C|D)P(D) + P(C|D^c)P(D^c)} = \dfrac{0.93 \cdot 0.05}{0.93 \cdot 0.05 + 0.02 \cdot 0.95} = \dfrac{93}{131}$ (3.43)

Solution to Exercise 3.9 (p. 71) 

Let $D_k$ = the event of $k$ defective and $G$ be the event a good one is chosen.

$P(D_0|G) = \dfrac{P(G|D_0)P(D_0)}{P(G|D_0)P(D_0) + P(G|D_1)P(D_1) + P(G|D_2)P(D_2) + P(G|D_3)P(D_3)}$ (3.44)

$= \dfrac{1 \cdot 1/4}{(1/4)(1 + 999/1000 + 998/1000 + 997/1000)} = \dfrac{1000}{3994}$ (3.45)

Solution to Exercise 3.10 (p. 71) 

a. $P(E_3S_3) = P(S_3|E_3)P(E_3) = 0.45 \cdot 0.05 = 0.0225$

b. $P(S_2 \vee S_3|E_2) = 0.80 + 0.10 = 0.90$

c. $p = (0.85 + 0.10 + 0.05) \cdot 0.65 + (0.80 + 0.10) \cdot 0.30 + 0.45 \cdot 0.05 = 0.9425$

Solution to Exercise 3.11 (p. 72) 

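Let $F$ = the event the employee favors the policy and $S$ = the event the employee says so.
$P(S) = 0.85$, $P(S|F^c) = 0.20$. Also, it is reasonable to assume $P(S|F) = 1$.

$P(S) = P(S|F)P(F) + P(S|F^c)[1 - P(F)]$ implies $P(F) = \dfrac{P(S) - P(S|F^c)}{1 - P(S|F^c)} = \dfrac{13}{16}$ (3.46)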

Solution to Exercise 3.12 (p. 72) 

$\dfrac{P(D^c|T)}{P(D|T)} = \dfrac{P(T|D^c)P(D^c)}{P(T|D)P(D)} = \dfrac{0.98 \cdot 0.99}{0.05 \cdot 0.01} = \dfrac{9702}{5}$ (3.47)

$P(D^c|T) = \dfrac{9702}{9707} = 1 - \dfrac{5}{9707}$ (3.48)

Solution to Exercise 3.13 (p. 72)

$H_i$ = the event from box $i$. $P(H_i) = 1/5$ and $P(D|H_i) = i/100$.

$P(H_i|D) = \dfrac{P(D|H_i)P(H_i)}{\sum_j P(D|H_j)P(H_j)} = i/15$, $1 \le i \le 5$ (3.49)

Solution to Exercise 3.14 (p. 72) 

Let $T$ = event test indicates defective, $D$ = event initially defective, and $G$ = event unit purchased is good.
Data are

$P(D) = 0.02$, $P(T^c|D) = 0.02$, $P(T|D^c) = 0.05$, $P(GT^c) = 0$, (3.50)

$P(G|DT) = 0.90$, $P(G|D^cT) = 1$ (3.51)

$P(D|G) = \dfrac{P(GD)}{P(G)}$, $P(GD) = P(GTD) = P(D)P(T|D)P(G|TD)$ (3.52)

$P(G) = P(GT) = P(GDT) + P(GD^cT) = P(D)P(T|D)P(G|TD) + P(D^c)P(T|D^c)P(G|TD^c)$ (3.53)

$P(D|G) = \dfrac{0.02 \cdot 0.98 \cdot 0.90}{0.02 \cdot 0.98 \cdot 0.90 + 0.98 \cdot 0.05 \cdot 1.00} = \dfrac{441}{1666}$ (3.54)

Solution to Exercise 3.15 (p. 72) 

Let $G$ = event the defendant is guilty, $L$ = the event the defendant is left handed. Prior odds:
$P(G)/P(G^c) = 2$. Result of testimony: $P(L|G)/P(L|G^c) = 6$.

$\dfrac{P(G|L)}{P(G^c|L)} = \dfrac{P(G)}{P(G^c)} \cdot \dfrac{P(L|G)}{P(L|G^c)} = 2 \cdot 6 = 12$ (3.55)

$P(G|L) = 12/13$ (3.56)

Solution to Exercise 3.16 (p. 72) 

$P(A) = P(A|C)P(C) + P(A|C^c)P(C^c) > P(B|C)P(C) + P(B|C^c)P(C^c) = P(B)$.
The converse is not true. Consider $P(C) = P(C^c) = 0.5$, $P(A|C) = 1/4$,
$P(A|C^c) = 3/4$, $P(B|C) = 1/2$, and $P(B|C^c) = 1/4$. Then

$1/2 = P(A) = \dfrac{1}{2}(1/4 + 3/4) > \dfrac{1}{2}(1/2 + 1/4) = P(B) = 3/8$ (3.57)

But $P(A|C) < P(B|C)$.

Solution to Exercise 3.17 (p. 72) 

Suppose $A \subset B$ with $P(A) < P(B)$. Then $P(A|B) = P(A)/P(B) < 1$ and $P(A|B^c) = 0$ so the sum is
less than one.

Solution to Exercise 3.18 (p. 72) 

a. $P(A|B) > P(A)$ iff $P(AB) > P(A)P(B)$ iff $P(AB^c) < P(A)P(B^c)$ iff $P(A|B^c) < P(A)$

b. $P(A^c|B) > P(A^c)$ iff $P(A^cB) > P(A^c)P(B)$ iff $P(AB) < P(A)P(B)$ iff $P(A|B) < P(A)$

c. $P(A|B) > P(A)$ iff $P(AB) > P(A)P(B)$ iff $P(A^cB^c) > P(A^c)P(B^c)$ iff $P(A^c|B^c) > P(A^c)$

Solution to Exercise 3.19 (p. 72) 

$1 \ge P(A \cup B) = P(A) + P(B) - P(AB) = P(A) + P(B) - P(A|B)P(B)$. Simple algebra gives the
desired result.

Solution to Exercise 3.20 (p. 72) 

$P(A|B) = \dfrac{P(AB)}{P(B)} = \dfrac{P(ABC) + P(ABC^c)}{P(B)}$ (3.58)

$= \dfrac{P(A|BC)P(BC) + P(A|BC^c)P(BC^c)}{P(B)} = P(A|BC)P(C|B) + P(A|BC^c)P(C^c|B)$ (3.59)

Solution to Exercise 3.21 (p. 73) 

$P(A|B) = 1$, $P(A|B^c) = 1/n$, $P(B) = p$

$P(B|A) = \dfrac{P(A|B)P(B)}{P(A|B)P(B) + P(A|B^c)P(B^c)} = \dfrac{p}{p + \frac{1}{n}(1 - p)} = \dfrac{np}{(n - 1)p + 1}$ (3.60)

$\dfrac{P(B|A)}{P(B)} = \dfrac{n}{np + 1 - p}$ increases from 1 to $1/p$ as $n \to \infty$ (3.61)

Solution to Exercise 3.22 (p. 73) 

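a. $P(B_2|R_1) = \dfrac{b}{n + c}$

b. $P(B_1B_2) = P(B_1)P(B_2|B_1) = \dfrac{b}{n} \cdot \dfrac{b + c}{n + c}$

c. $P(R_2) = P(R_2|R_1)P(R_1) + P(R_2|B_1)P(B_1)$

$= \dfrac{r + c}{n + c} \cdot \dfrac{r}{n} + \dfrac{r}{n + c} \cdot \dfrac{b}{n} = \dfrac{r(r + c + b)}{n(n + c)}$ (3.62)

d. $P(B_1|R_2) = \dfrac{P(R_2|B_1)P(B_1)}{P(R_2)}$ with $P(R_2|B_1)P(B_1) = \dfrac{r}{n + c} \cdot \dfrac{b}{n}$. Using (c), we have

$P(B_1|R_2) = \dfrac{b}{r + b + c} = \dfrac{b}{n + c}$ (3.63)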
Chapter 4
Independence of Events 



4.1 Independence of Events 1 

Historically, the notion of independence has played a prominent role in probability. If events form an 
independent class, much less information is required to determine probabilities of Boolean combinations 
and calculations are correspondingly easier. In this unit, we give a precise formulation of the concept of 
independence in the probability sense. As in the case of all concepts which attempt to incorporate intuitive 
notions, the consequences must be evaluated for evidence that these ideas have been captured successfully. 

4.1.1 Independence as lack of conditioning 

There are many situations in which we have an "operational independence." 

• Suppose a deck of playing cards is shuffled and a card is selected at random, then replaced with reshuffling.
A second card picked on a repeated try should not be affected by the first choice.

• If customers come into a well stocked shop at different times, each unaware of the choice made by the
others, the item purchased by one should not be affected by the choice made by the other.

• If two students are taking exams in different courses, the grade one makes should not affect the grade 
made by the other. 

The list of examples could be extended indefinitely. In each case, we should expect to model the events as
independent in some way. How should we incorporate the concept in our developing model of probability?
We take our cue from the examples above. Pairs of events are considered. The "operational independence"
described indicates that knowledge that one of the events has occurred does not affect the likelihood that the
other will occur. For a pair of events $\{A, B\}$, this is the condition

$P(A|B) = P(A)$ (4.1)

Occurrence of the event $A$ is not "conditioned by" occurrence of the event $B$. Our basic interpretation is that
$P(A)$ indicates the likelihood of the occurrence of event $A$. The development of conditional probability
in the module Conditional Probability (Section 3.1) leads to the interpretation of $P(A|B)$ as the likelihood
that $A$ will occur on a trial, given knowledge that $B$ has occurred. If such knowledge of the occurrence of $B$
does not affect the likelihood of the occurrence of $A$, we should be inclined to think of the events $A$ and $B$
as being independent in a probability sense.



1 This content is available online at <http://cnx.Org/content/m23253/l.6/>. 
4.1.2 Independent pairs

We take our cue from the condition $P(A|B) = P(A)$. Property (CP4) (p. 69) for conditional probability
(in the case of equality) yields sixteen equivalent conditions as follows.



$P(A|B) = P(A)$            $P(B|A) = P(B)$            $P(AB) = P(A)P(B)$
$P(A|B^c) = P(A)$          $P(B^c|A) = P(B^c)$        $P(AB^c) = P(A)P(B^c)$
$P(A^c|B) = P(A^c)$        $P(B|A^c) = P(B)$          $P(A^cB) = P(A^c)P(B)$
$P(A^c|B^c) = P(A^c)$      $P(B^c|A^c) = P(B^c)$      $P(A^cB^c) = P(A^c)P(B^c)$

Table 4.1

$P(A|B) = P(A|B^c)$    $P(A^c|B) = P(A^c|B^c)$    $P(B|A) = P(B|A^c)$    $P(B^c|A) = P(B^c|A^c)$

Table 4.2

These conditions are equivalent in the sense that if any one holds, then all hold. We may choose any one
of these as the defining condition and consider the others as equivalents for the defining condition. Because
of its simplicity and symmetry with respect to the two events, we adopt the product rule in the upper right
hand corner of the table.

Definition. The pair $\{A, B\}$ of events is said to be (stochastically) independent iff the following product
rule holds:

$P(AB) = P(A)P(B)$ (4.2)



Remark. Although the product rule is adopted as the basis for definition, in many applications the assump- 
tions leading to independence may be formulated more naturally in terms of one or another of the equivalent 
expressions. We are free to do this, for the effect of assuming any one condition is to assume them all. 

The equivalences in the right-hand column of the upper portion of the table may be expressed as a 
replacement rule, which we augment and extend below: 

If the pair $\{A, B\}$ is independent, so is any pair obtained by taking the complement of either or both of
the events.

We note two relevant facts:

• Suppose event $N$ has probability zero (is a null event). Then for any event $A$, we have
$0 \le P(AN) \le P(N) = 0 = P(A)P(N)$, so that the product rule holds. Thus $\{N, A\}$ is an independent pair for any
event $A$.

• If event $S$ has probability one (is an almost sure event), then its complement $S^c$ is a null event. By the
replacement rule and the fact just established, $\{S^c, A\}$ is independent, so $\{S, A\}$ is independent.

The replacement rule may thus be extended to: 
Replacement Rule 

If the pair $\{A, B\}$ is independent, so is any pair obtained by replacing either or both of the events by their
complements or by a null event or by an almost sure event.

CAUTION 

1. Unless at least one of the events has probability one or zero, a pair cannot be both independent
and mutually exclusive. Intuitively, if the pair is mutually exclusive, then the occurrence of one
requires that the other does not occur. Formally: Suppose $0 < P(A) < 1$ and $0 < P(B) < 1$.
$\{A, B\}$ mutually exclusive implies $P(AB) = P(\emptyset) = 0 \ne P(A)P(B)$. $\{A, B\}$ independent implies
$P(AB) = P(A)P(B) > 0 = P(\emptyset)$.

2. Independence is not a property of events. Two non mutually exclusive events may be independent under
one probability measure, but may not be independent for another. This can be seen by considering
various probability distributions on a Venn diagram or minterm map.

4.1.3 Independent classes

Extension of the concept of independence to an arbitrary class of events utilizes the product rule. 

Definition. A class of events is said to be (stochastically) independent iff the product rule holds for 
every finite subclass of two or more events in the class. 

A class $\{A, B, C\}$ is independent iff all four of the following product rules hold:

$P(AB) = P(A)P(B)$, $P(AC) = P(A)P(C)$, $P(BC) = P(B)P(C)$, $P(ABC) = P(A)P(B)P(C)$ (4.3)

If any one or more of these product expressions fail, the class is not independent. A similar situation
holds for a class of four events: the product rule must hold for every pair, for every triple, and for the whole
class. Note that we say "not independent" or "nonindependent" rather than dependent. The reason for this
becomes clearer in dealing with independent random variables.

We consider some classical examples of nonindependent classes.

Example 4.1: Some nonindependent classes 

1. Suppose $\{A_1, A_2, A_3, A_4\}$ is a partition, with each $P(A_i) = 1/4$. Let

$A = A_1 \bigvee A_2$, $B = A_1 \bigvee A_3$, $C = A_1 \bigvee A_4$ (4.4)

Then the class $\{A, B, C\}$ has $P(A) = P(B) = P(C) = 1/2$ and is pairwise independent,
but not independent, since

$P(AB) = P(A_1) = 1/4 = P(A)P(B)$ and similarly for the other pairs, but (4.5)

$P(ABC) = P(A_1) = 1/4 \ne P(A)P(B)P(C)$ (4.6)

2. Consider the class $\{A, B, C, D\}$ with $AD = BD = \emptyset$, $C = AB \bigvee D$, $P(A) = P(B) = 1/4$,
$P(AB) = 1/64$, and $P(D) = 15/64$. Use of a minterm map shows these assignments are
consistent. Elementary calculations show the product rule applies to the class $\{A, B, C\}$ but
no two of these three events form an independent pair.
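
The first of these cases can be checked numerically. In the sketch below, each event is represented by a logical vector marking which members of the partition it contains; this representation and the names used are our own choice for illustration.

```matlab
% Sketch: pairwise independent but not independent (first case above).
Pw = [1 1 1 1]/4;                 % P(A1), ..., P(A4) for the partition
A = logical([1 1 0 0]);           % A is the union of A1 and A2
B = logical([1 0 1 0]);           % B is the union of A1 and A3
C = logical([1 0 0 1]);           % C is the union of A1 and A4
P = @(E) sum(Pw(E));              % probability of a union of partition members
tol = 1e-12;
pairs_ok = [abs(P(A&B) - P(A)*P(B)) < tol, ...
            abs(P(A&C) - P(A)*P(C)) < tol, ...
            abs(P(B&C) - P(B)*P(C)) < tol];
triple_ok = abs(P(A&B&C) - P(A)*P(B)*P(C)) < tol;
disp(pairs_ok)       % product rule for each pair
disp(triple_ok)      % product rule for the whole class
```

The product rule holds for every pair but fails for the triple, so the class is not independent.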

As noted above, the replacement rule holds for any pair of events. It is easy to show, although somewhat
cumbersome to write out, that if the rule holds for any finite number $k$ of events in an independent class, it
holds for any $k + 1$ of them. By the principle of mathematical induction, the rule must hold for any finite
subclass. We may extend the replacement rule as follows.

General Replacement Rule

If a class is independent, we may replace any of the sets by its complement, by a null event, or by an almost
sure event, and the resulting class is also independent. Such replacements may be made for any number of
the sets in the class. One immediate and important consequence is the following.

Minterm Probabilities

If $\{A_i : 1 \le i \le n\}$ is an independent class and the class $\{P(A_i) : 1 \le i \le n\}$ of individual probabilities
is known, then the probability of every minterm may be calculated.

Example 4.2: Minterm probabilities for an independent class

Suppose the class $\{A, B, C\}$ is independent with respective probabilities $P(A) = 0.3$, $P(B) = 0.6$,
and $P(C) = 0.5$. Then

$\{A^c, B^c, C^c\}$ is independent and $P(M_0) = P(A^c)P(B^c)P(C^c) = 0.14$

$\{A^c, B^c, C\}$ is independent and $P(M_1) = P(A^c)P(B^c)P(C) = 0.14$

Similarly, the probabilities of the other six minterms, in order, are 0.21, 0.21, 0.06, 0.06, 0.09,
and 0.09. With these minterm probabilities, the probability of any Boolean combination of $A$, $B$,
and $C$ may be calculated.

In general, eight appropriate probabilities must be specified to determine the minterm probabilities for a 
class of three events. In the independent case, three appropriate probabilities are sufficient. 

Example 4.3: Three probabilities yield the minterm probabilities 

Suppose $\{A, B, C\}$ is independent with $P(A \cup BC) = 0.51$, $P(AC^c) = 0.15$, and $P(A) = 0.30$.
Then $P(C^c) = 0.15/0.3 = 0.5 = P(C)$ and

$P(A) + P(A^c)P(B)P(C) = 0.51$ so that $P(B) = \dfrac{0.51 - 0.30}{0.7 \times 0.5} = 0.6$ (4.7)

With each of the basic probabilities determined, we may calculate the minterm probabilities, hence 
the probability of any Boolean combination of the events. 
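
The steps of Example 4.3 may be carried out in MATLAB as follows. This is a sketch only: the use of the built-in function kron to form the eight products is our own shortcut, not a procedure from this work, and the variable names are assumptions.

```matlab
% Sketch: recover P(B) and P(C) from the data of Example 4.3, then form
% the minterm probabilities by the product rule.
PAorBC = 0.51;  PACc = 0.15;  PA = 0.30;    % data of Example 4.3
PCc = PACc/PA;                       % product rule: P(ACc) = P(A)P(Cc)
PC  = 1 - PCc;
PB  = (PAorBC - PA)/((1 - PA)*PC);   % from P(A) + P(Ac)P(B)P(C) = P(A or BC)
% minterm probabilities P(M0), ..., P(M7), with A as the leading variable
pm = kron([1-PA PA], kron([1-PB PB], [1-PC PC]));
disp([PB PC])
disp(pm)
```

Since the recovered values are $P(B) = 0.6$ and $P(C) = 0.5$, the minterm probabilities agree with those of Example 4.2.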

Example 4.4: MATLAB and the product rule 

Frequently we have a large enough independent class $\{E_1, E_2, \cdots, E_n\}$ that it is desirable to
use MATLAB (or some other computational aid) to calculate the probabilities of various "and"
combinations (intersections) of the events or their complements. Suppose the independent class
$\{E_1, E_2, \cdots, E_{10}\}$ has respective probabilities

0.13 0.37 0.12 0.56 0.33 0.71 0.22 0.43 0.57 0.31 (4.8)

It is desired to calculate (a) $P(E_1E_2E_3^cE_4E_5^cE_6^cE_7)$, and (b) $P(E_1^cE_2E_3^cE_4E_5^cE_6E_7^cE_8E_9^cE_{10})$.
We may use the MATLAB function prod and the scheme for indexing a matrix.

>> p = 0.01*[13 37 12 56 33 71 22 43 57 31];
>> q = 1-p;
>> % First case
>> e = [1 2 4 7];                 % Uncomplemented positions
>> f = [3 5 6];                   % Complemented positions
>> P = prod(p(e))*prod(q(f))      % p(e) probs of uncomplemented factors
P = 0.0010                        % q(f) probs of complemented factors
>> % Case of uncomplemented in even positions; complemented in odd positions
>> g = find(rem(1:10,2) == 0);    % The even positions
>> h = find(rem(1:10,2) ~= 0);    % The odd positions
>> P = prod(p(g))*prod(q(h))
P = 0.0034

In the unit on MATLAB and Independent Classes, we extend the use of MATLAB in the calculations 
for such classes. 
4.2 MATLAB and Independent Classes 2
4.2.1 MATLAB and Independent Classes 

In the unit on Minterms (Section 2.1), we show how to use minterm probabilities and minterm vectors to 
calculate probabilities of Boolean combinations of events. In Independence of Events we show that in the 
independent case, we may calculate all minterm probabilities from the probabilities of the basic events. 
While these calculations are straightforward, they may be tedious and subject to errors. Fortunately, in this 
case we have an m-function minprob which calculates all minterm probabilities from the probabilities of the 
basic or generating sets. This function uses the m-function mintable to set up the patterns of p's and q's for 
the various minterms and then takes the products to obtain the set of minterm probabilities. 

Example 4.5 

>> pm = minprob(0.1*[4 7 6])
pm = 0.0720 0.1080 0.1680 0.2520 0.0480 0.0720 0.1120 0.1680

It may be desirable to arrange these as on a minterm map. For this we have an m-function minmap
which reshapes the row matrix pm, as follows:

>> t = minmap(pm)
t = 0.0720 0.1680 0.0480 0.1120
    0.1080 0.2520 0.0720 0.1680
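
For the reader without the m-functions at hand, the pattern-and-product scheme described above can be sketched with built-in MATLAB functions. This is not the listing of minprob, mintable, or minmap; the pattern matrix M below is our own stand-in, laid out with one minterm per row, and the final reshape is claimed only for the three-event case shown.

```matlab
% Sketch: minterm probabilities for an independent class from 0-1 patterns.
p  = 0.1*[4 7 6];                       % probabilities of the basic events
n  = numel(p);
M  = dec2bin(0:2^n-1, n) - '0';         % row j+1 is the 0-1 pattern of minterm j
Pm = repmat(p, 2^n, 1);                 % the p's laid out to match the patterns
pm = prod(M.*Pm + (1-M).*(1-Pm), 2)';   % p where the pattern is 1, q = 1-p where it is 0
t  = reshape(pm, 2, []);                % for three events, the arrangement displayed above
disp(pm)
disp(t)
```

The number of minterms doubles with each added event, so this direct approach is practical only for moderate n.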

Probability of occurrence of k of n independent events

In Example 2, we show how to use the m-functions mintable and csort to obtain the probability of the
occurrence of k of n events, when minterm probabilities are available. In the case of an independent class,
the minterm probabilities are calculated easily by minprob. It is only necessary to specify the probabilities
for the n basic events and the numbers k of events. The size of the class, hence the mintable, is determined,
and the minterm probabilities are calculated by minprob. We have two useful m-functions. If P is a matrix
of the n individual event probabilities, and k is a matrix of integers less than or equal to n, then

function y = ikn(P, k) calculates individual probabilities that k of n occur

function y = ckn(P, k) calculates the probabilities that k or more occur

Example 4.6 

>> p = 0.01*[13 37 12 56 33 71 22 43 57 31];
>> k = [2 5 7];
>> P = ikn(p,k)
P = 0.1401 0.1845 0.0225       % individual probabilities
>> Pc = ckn(p,k)
Pc = 0.9516 0.2921 0.0266      % cumulative probabilities
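
To see what these two functions compute, one may sum minterm probabilities according to the number of uncomplemented places. The sketch below does this with built-in functions for the data of Example 4.6. It is an illustration under our own naming (M, cnt, Pi, Pc) and makes no claim about how ikn and ckn are actually coded.

```matlab
% Sketch: probability that exactly k, or k or more, of n independent events occur.
p = 0.01*[13 37 12 56 33 71 22 43 57 31];   % event probabilities from Example 4.6
k = [2 5 7];
n = numel(p);
M = dec2bin(0:2^n-1, n) - '0';              % minterm patterns (1 = event occurs)
Q = repmat(p, 2^n, 1);
pm = prod(M.*Q + (1 - M).*(1 - Q), 2);      % column of 2^n minterm probabilities
cnt = sum(M, 2);                            % number of events occurring in each minterm
Pi = zeros(size(k));
Pc = zeros(size(k));
for j = 1:numel(k)
    Pi(j) = sum(pm(cnt == k(j)));           % exactly k(j) occur
    Pc(j) = sum(pm(cnt >= k(j)));           % k(j) or more occur
end
disp([k; Pi; Pc])
```

With n = 10 this enumerates 1024 minterms, which is quick. The pattern matrix doubles in size with each additional event, so the approach is practical only for moderate n.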

Reliability of systems with independent components 

Suppose a system has n components which fail independently. Let $E_i$ be the event the ith component
survives the designated time period. Then $R_i = P(E_i)$ is defined to be the reliability of that component.
The reliability R of the complete system is a function of the component reliabilities. There are three basic 
configurations. General systems may be decomposed into subsystems of these types. The subsystems become 
components in the larger configuration. The three fundamental configurations are: 

1. Series. The system operates iff all n components operate: $R = \prod_{i=1}^{n} R_i$

2. Parallel. The system operates iff not all components fail: $R = 1 - \prod_{i=1}^{n} (1 - R_i)$

3. k of n. The system operates iff k or more components operate. R may be calculated with the
m-function ckn. If the component probabilities are all the same, it is more efficient to use the m-function
cbinom (see Bernoulli trials and the binomial distribution, below).

2 This content is available online at <http://cnx.org/content/m23255/1.6/>.

MATLAB solution. Put the component reliabilities in matrix RC = [R1 R2 ... Rn]

1. Series Configuration

>> R = prod(RC)          % prod is a built in MATLAB function

2. Parallel Configuration

>> R = parallel(RC)      % parallel is a user defined function

3. k of n Configuration

>> R = ckn(RC,k)         % ckn is a user defined function (in file ckn.m).

Example 4.7 

There are eight components, numbered 1 through 8. Component 1 is in series with a parallel 
combination of components 2 and 3, followed by a 3 of 5 combination of components 4 through 8 
(see Figure 4.1 for a schematic representation). Probabilities of the components in order are

0.95 0.90 0.92 0.80 0.83 0.91 0.85 0.85 (4.9) 

The second and third probabilities are for the parallel pair, and the last five probabilities are for 
the 3 of 5 combination. 

>> RC = 0.01*[95 90 92 80 83 91 85 85];           % Component reliabilities
>> Ra = RC(1)*parallel(RC(2:3))*ckn(RC(4:8),3)    % Solution
Ra = 0.9172




Figure 4.1: Schematic representation of the system in Example 4.7 



Example 4.8

>> RC = 0.01*[95 90 92 80 83 91 85 85];                    % Component reliabilities
>> Rb = prod(RC(1:2))*parallel([RC(3),ckn(RC(4:8),3)])     % Solution
Rb = 0.8532

Figure 4.2: Schematic representation of the system in Example 4.8
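
Since parallel is user defined and not listed here, a reader without it may check Examples 4.7 and 4.8 with the following sketch. The anonymous function par is a hypothetical stand-in based on the parallel formula above, and the 3 of 5 reliability is obtained by repeated convolution rather than by ckn.

```matlab
% Sketch: series, parallel, and k of n reliabilities with built-ins only.
RC = 0.01*[95 90 92 80 83 91 85 85];   % component reliabilities
par = @(R) 1 - prod(1 - R);            % hypothetical stand-in for the m-function parallel

% Distribution of the number of surviving components among components 4 through 8.
d = 1;
for r = RC(4:8)
    d = conv(d, [1 - r, r]);           % d(j+1) = P(exactly j of those seen so far survive)
end
R35 = sum(d(4:end));                   % 3 or more of the 5 survive

Ra = RC(1)*par(RC(2:3))*R35;           % system of Example 4.7
Rb = prod(RC(1:2))*par([RC(3), R35]);  % system of Example 4.8
fprintf('Ra = %.4f   Rb = %.4f\n', Ra, Rb)
```

Each subsystem reliability is treated as the reliability of a single component in the larger configuration, which is the decomposition described above.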



A test for independence 

It is difficult to look at a list of minterm probabilities and determine whether or not the generating events 
form an independent class. The m-function imintest has as argument a vector of minterm probabilities. It 
checks for feasible size, determines the number of variables, and performs a check for independence. 

Example 4.9 



>> pm = 0.01*[15 5 2 18 25 5 18 12];    % An arbitrary class
>> disp(imintest(pm))
The class is NOT independent
Minterms for which the product rule fails
     1     1     1     0
     1     1     1     0



Example 4.10 



>> pm = [0.10 0.15 0.20 0.25 0.30];     % An improper number of probabilities
>> disp(imintest(pm))
The number of minterm probabilities incorrect



Example 4.11 
>> pm = minprob([0.5 0.3 0.7]);
>> disp(imintest(pm))
The class is independent
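
A test of this kind can be sketched as follows, assuming (we have not seen the code of imintest) that it compares each minterm probability with the product of the corresponding marginal probabilities. The tolerance 1e-7 and the names used are our own choices, and the two-row display is the minterm map layout for three variables only.

```matlab
% Sketch: check the product rule on every minterm (data of Example 4.9).
pm = 0.01*[15 5 2 18 25 5 18 12];
n = round(log2(numel(pm)));
if 2^n ~= numel(pm)
    error('The number of minterm probabilities incorrect');
end
M = dec2bin(0:2^n-1, n) - '0';            % minterm patterns
p = pm*M;                                 % marginal probabilities of the generating events
Q = repmat(p, 2^n, 1);
pt = prod(M.*Q + (1 - M).*(1 - Q), 2)';   % minterm probabilities if the class were independent
fails = abs(pm - pt) > 1e-7;              % arbitrary tolerance for roundoff
N = reshape(double(fails), 2, 2^(n-1));   % minterm map arrangement (three variables)
if any(fails)
    disp('The class is NOT independent'); disp(N)
else
    disp('The class is independent')
end
```

For these data the marginal probabilities are 0.6, 0.5, 0.4, and the product rule holds only for minterms 6 and 7.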



4.2.2 Probabilities of Boolean combinations

As in the nonindependent case, we may utilize the minterm expansion and the minterm probabilities to 
calculate the probabilities of Boolean combinations of events. However, it is frequently more efficient to 
manipulate the expressions for the Boolean combination to be a disjoint union of intersections. 

Example 4.12: A simple Boolean combination 

Suppose the class {A, B, C} is independent, with respective probabilities 0.4, 0.6, 0.8. Determine
$P(A \cup BC)$. The minterm expansion is

$A \cup BC = M(3, 4, 5, 6, 7)$, so that $P(A \cup BC) = p(3, 4, 5, 6, 7)$    (4.10)

It is not difficult to use the product rule and the replacement theorem to calculate the needed
minterm probabilities. Thus $p(3) = P(A^c) P(B) P(C) = 0.6 \cdot 0.6 \cdot 0.8 = 0.2880$. Similarly
$p(4) = 0.0320$, $p(5) = 0.1280$, $p(6) = 0.0480$, $p(7) = 0.1920$. The desired probability is the
sum of these, 0.6880.

As an alternate approach, we write

$A \cup BC = A \bigvee A^c BC$, so that $P(A \cup BC) = 0.4 + 0.6 \cdot 0.6 \cdot 0.8 = 0.6880$    (4.11)

Considerably fewer arithmetic operations are required in this calculation.
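
Both routes of Example 4.12 can be checked numerically. In the sketch below the minterm vectors A, B, C are built directly from the binary patterns, which we assume is equivalent to what a procedure such as minvec3 provides. The variable names are our own.

```matlab
% Sketch: P(A u BC) by minterm expansion and by the disjoint-union form.
p = [0.4 0.6 0.8];                        % P(A), P(B), P(C)
n = numel(p);
M = dec2bin(0:2^n-1, n) - '0';            % minterm patterns
Q = repmat(p, 2^n, 1);
pm = prod(M.*Q + (1 - M).*(1 - Q), 2)';   % minterm probabilities
A = M(:,1)' == 1;                         % minterm vectors as logical rows
B = M(:,2)' == 1;
C = M(:,3)' == 1;
F = A | (B & C);                          % minterm vector for A u BC
idx = find(F) - 1;                        % minterm numbers in the expansion
PF1 = double(F)*pm';                      % sum of the selected minterm probabilities
PF2 = p(1) + (1 - p(1))*p(2)*p(3);        % P(A) + P(A^c)P(B)P(C)
fprintf('PF1 = %.4f   PF2 = %.4f\n', PF1, PF2)
```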

In larger problems, or in situations where probabilities of several Boolean combinations are to be determined,
it may be desirable to calculate all minterm probabilities then use the minterm vector techniques
introduced earlier to calculate probabilities for various Boolean combinations. As a larger example for which 
computational aid is highly desirable, consider again the class and the probabilities utilized in Example 4.6, 
above. 

Example 4.13 

Consider again the independent class $\{E_1, E_2, \cdots, E_{10}\}$ with respective probabilities
{0.13 0.37 0.12 0.56 0.33 0.71 0.22 0.43 0.57 0.31}. We wish to calculate

$P(F) = P(E_1 \cup E_3 (E_4 \cup E_7^c) \cup E_2 (E_5^c \cup E_6 E_8) \cup E_9 E_{10}^c)$    (4.12)

There are $2^{10} = 1024$ minterm probabilities to be calculated. Each requires the multiplication of
ten numbers. The solution with MATLAB is easy, as follows:

>> P = 0.01*[13 37 12 56 33 71 22 43 57 31];
>> minvec10
Vectors are A1 thru A10 and A1c thru A10c
They may be renamed, if desired.
>> F = (A1|(A3&(A4|A7c)))|(A2&(A5c|(A6&A8)))|(A9&A10c);
>> pm = minprob(P);
>> PF = F*pm'
PF = 0.6636

Writing out the expression for F is tedious and error prone. We could simplify as follows: 
>> A = A1|(A3&(A4|A7c));
>> B = A2&(A5c|(A6&A8));
>> C = A9&A10c;
>> F = A|B|C;      % This minterm vector is the same as for F above

This decomposition of the problem indicates that it may be solved as a series of smaller problems. 
First, we need some central facts about independence of Boolean combinations. 



4.2.3 Independent Boolean combinations

Suppose we have a Boolean combination of the events in the class $\{A_i : 1 \le i \le n\}$ and a second combination
of the events in the class $\{B_j : 1 \le j \le m\}$. If the combined class $\{A_i, B_j : 1 \le i \le n, 1 \le j \le m\}$ is
independent, we would expect the combinations of the subclasses to be independent. It is important to see
that this is in fact a consequence of the product rule, for it is further evidence that the product rule has
captured the essence of the intuitive notion of independence. In the following discussion, we exhibit the
essential structure which provides the basis for the following general proposition.

Proposition. Consider n distinct subclasses of an independent class of events. If for each i the event $A_i$
is a Boolean (logical) combination of members of the ith subclass, then the class $\{A_1, A_2, \cdots, A_n\}$ is an
independent class.

Verification of this far reaching result rests on the minterm expansion and two elementary facts about 
the disjoint subclasses of an independent class. We state these facts and consider in each case an example 
which exhibits the essential structure. Formulation of the general result, in each case, is simply a matter of 
careful use of notation. 

1. A class each of whose members is a minterm formed by members of a distinct subclass of an independent 
class is itself an independent class. 

Example 4.14 

Consider the independent class $\{A_1, A_2, A_3, B_1, B_2, B_3, B_4\}$, with respective probabilities
0.4, 0.7, 0.3, 0.5, 0.8, 0.3, 0.6. Consider $M_3$, minterm three for the class $\{A_1, A_2, A_3\}$, and
$N_5$, minterm five for the class $\{B_1, B_2, B_3, B_4\}$. Then

$P(M_3) = P(A_1^c A_2 A_3) = 0.6 \cdot 0.7 \cdot 0.3 = 0.126$ and $P(N_5) = P(B_1^c B_2 B_3^c B_4) = 0.5 \cdot 0.8 \cdot 0.7 \cdot 0.6 = 0.168$    (4.13)

Also

$P(M_3 N_5) = P(A_1^c A_2 A_3 B_1^c B_2 B_3^c B_4) = 0.6 \cdot 0.7 \cdot 0.3 \cdot 0.5 \cdot 0.8 \cdot 0.7 \cdot 0.6$
$= (0.6 \cdot 0.7 \cdot 0.3) \cdot (0.5 \cdot 0.8 \cdot 0.7 \cdot 0.6) = P(M_3) P(N_5) = 0.0212$    (4.14)

The product rule shows the desired independence.

Again, it should be apparent that the result holds for any number of $A_i$ and $B_j$, and it can be extended
to any number of distinct subclasses of an independent class.

2. Suppose each member of a class can be expressed as a disjoint union. If each auxiliary class formed by 
taking one member from each of the disjoint unions is an independent class, then the original class is 
independent. 

Example 4.15 

Suppose $A = A_1 \bigvee A_2 \bigvee A_3$ and $B = B_1 \bigvee B_2$, with $\{A_i, B_j\}$ independent for each pair i, j.
Suppose

$P(A_1) = 0.3$, $P(A_2) = 0.4$, $P(A_3) = 0.1$, $P(B_1) = 0.2$, $P(B_2) = 0.5$    (4.15)

We wish to show that the pair {A, B} is independent; i.e., the product rule $P(AB) = P(A) P(B)$ holds.

COMPUTATION

$P(A) = P(A_1) + P(A_2) + P(A_3) = 0.3 + 0.4 + 0.1 = 0.8$ and $P(B) = P(B_1) + P(B_2) = 0.2 + 0.5 = 0.7$    (4.16)

Now

$AB = (A_1 \bigvee A_2 \bigvee A_3)(B_1 \bigvee B_2) = A_1 B_1 \bigvee A_1 B_2 \bigvee A_2 B_1 \bigvee A_2 B_2 \bigvee A_3 B_1 \bigvee A_3 B_2$    (4.17)

By additivity and pairwise independence, we have

$P(AB) = P(A_1) P(B_1) + P(A_1) P(B_2) + P(A_2) P(B_1) + P(A_2) P(B_2) + P(A_3) P(B_1) + P(A_3) P(B_2)$
$= 0.3 \cdot 0.2 + 0.3 \cdot 0.5 + 0.4 \cdot 0.2 + 0.4 \cdot 0.5 + 0.1 \cdot 0.2 + 0.1 \cdot 0.5 = 0.56 = P(A) P(B)$    (4.18)

The product rule can also be established algebraically from the expression for P(AB), as
follows:

$P(AB) = P(A_1) [P(B_1) + P(B_2)] + P(A_2) [P(B_1) + P(B_2)] + P(A_3) [P(B_1) + P(B_2)]$
$= [P(A_1) + P(A_2) + P(A_3)] [P(B_1) + P(B_2)] = P(A) P(B)$    (4.19)

It should be clear that the pattern just illustrated can be extended to the general case. If

$A = \bigvee_{i=1}^{n} A_i$ and $B = \bigvee_{j=1}^{m} B_j$, with each pair $\{A_i, B_j\}$ independent    (4.20)

then the pair {A, B} is independent. Also, we may extend this rule to the triple {A, B, C}

$A = \bigvee_{i=1}^{n} A_i$, $B = \bigvee_{j=1}^{m} B_j$, and $C = \bigvee_{k=1}^{r} C_k$, with each class $\{A_i, B_j, C_k\}$ independent    (4.21)

and similarly for any finite number of such combinations, so that the second proposition holds.
3. Begin with an independent class $\mathcal{E}$ of n events. Select m distinct subclasses and form Boolean
combinations for each of these. Use of the minterm expansion for each of these Boolean combinations and
the two propositions just illustrated shows that the class of Boolean combinations is independent. To
illustrate, we return to Example 4.13, which involves an independent class of ten events.

Example 4.16: A hybrid approach 

Consider again the independent class $\{E_1, E_2, \cdots, E_{10}\}$ with respective probabilities
{0.13 0.37 0.12 0.56 0.33 0.71 0.22 0.43 0.57 0.31}. We wish to calculate

$P(F) = P(E_1 \cup E_3 (E_4 \cup E_7^c) \cup E_2 (E_5^c \cup E_6 E_8) \cup E_9 E_{10}^c)$    (4.22)

In the previous solution, we use minprob to calculate the $2^{10} = 1024$ minterms for all ten of
the $E_i$ and determine the minterm vector for F. As we note in the alternate expansion of F,

$F = A \cup B \cup C$, where $A = E_1 \cup E_3 (E_4 \cup E_7^c)$, $B = E_2 (E_5^c \cup E_6 E_8)$, $C = E_9 E_{10}^c$    (4.23)

We may calculate directly $P(C) = 0.57 \cdot 0.69 = 0.3933$. Now A is a Boolean combination of
$\{E_1, E_3, E_4, E_7\}$ and B is a combination of $\{E_2, E_5, E_6, E_8\}$. By the result on independence
of Boolean combinations, the class {A, B, C} is independent. We use the m-procedures to
calculate P(A) and P(B). Then we deal with the independent class {A, B, C} to obtain the
probability of F.

>> p = 0.01*[13 37 12 56 33 71 22 43 57 31];
>> pa = p([1 3 4 7])          % Selection of probabilities for A
>> pb = p([2 5 6 8])          % Selection of probabilities for B
>> pma = minprob(pa)          % Minterm probabilities for calculating P(A)
>> pmb = minprob(pb)          % Minterm probabilities for calculating P(B)
>> minvec4;
>> a = A|(B&(C|Dc));          % A corresponds to E1, B to E3, C to E4, D to E7
>> PA = a*pma'
PA = 0.2243
>> b = A&(Bc|(C&D));          % A corresponds to E2, B to E5, C to E6, D to E8
>> PB = b*pmb'
PB = 0.2852
>> PC = p(9)*(1 - p(10))
PC = 0.3933
>> pm = minprob([PA PB PC])
>> minvec3                    % The problem becomes a three variable problem
>> F = A|B|C;                 % with {A,B,C} an independent class
>> PF = F*pm'
PF = 0.6636                   % Agrees with the result of Example 4.13
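
Because the sub-combinations involve disjoint sets of independent events, the same probabilities may also be obtained without any minterms, using only P(X ∪ Y) = 1 - (1 - P(X))(1 - P(Y)) and the product rule. The sketch below does this and compares the result with a brute-force sum over all 1024 minterms. The helper orp and the other names are our own and are not m-functions from the text.

```matlab
% Sketch: P(F) for Example 4.16 by direct use of independence, with a minterm check.
p = 0.01*[13 37 12 56 33 71 22 43 57 31];
orp = @(x, y) 1 - (1 - x).*(1 - y);            % P(X u Y) for independent X, Y (our helper)
PA = orp(p(1), p(3)*orp(p(4), 1 - p(7)));      % A = E1 u E3(E4 u E7^c)
PB = p(2)*orp(1 - p(5), p(6)*p(8));            % B = E2(E5^c u E6 E8)
PC = p(9)*(1 - p(10));                         % C = E9 E10^c
PF = orp(orp(PA, PB), PC);                     % {A, B, C} is independent

% Brute-force check over all minterms of the ten events.
n = numel(p);
M = dec2bin(0:2^n-1, n) - '0';
Q = repmat(p, 2^n, 1);
pm = prod(M.*Q + (1 - M).*(1 - Q), 2);
E = (M == 1);                                  % column i is the minterm vector of Ei
F = E(:,1) | (E(:,3) & (E(:,4) | ~E(:,7))) | ...
    (E(:,2) & (~E(:,5) | (E(:,6) & E(:,8)))) | (E(:,9) & ~E(:,10));
PFcheck = sum(pm(F));
fprintf('PA = %.4f  PB = %.4f  PC = %.4f  PF = %.4f  check = %.4f\n', PA, PB, PC, PF, PFcheck)
```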



4.3 Composite Trials 3 

4.3.1 Composite trials and component events

Often a trial is a composite one. That is, the fundamental trial is completed by performing several steps. 
In some cases, the steps are carried out sequentially in time. In other situations, the order of performance 
plays no significant role. Some of the examples in the unit on Conditional Probability (Section 3.1) involve 
such multistep trials. We examine more systematically how to model composite trials in terms of events 
determined by the components of the trials. In the subsequent section, we illustrate this approach in the 
important special case of Bernoulli trials, in which each outcome results in a success or failure to achieve a 
specified condition. 

We call the individual steps in the composite trial component trials. For example, in the experiment of
flipping a coin ten times, we refer to the ith toss as the ith component trial. In many cases, the component
trials will be performed sequentially in time. But we may have an experiment in which ten coins are flipped
simultaneously. For purposes of analysis, we impose an ordering, usually by assigning indices. The question
is how to model these repetitions. Should they be considered as ten trials of a single simple experiment? It
turns out that this is not a useful formulation. We need to consider the composite trial as a single outcome,
i.e., represented by a single point in the basic space $\Omega$.

Some authors give considerable attention to the nature of the basic space, describing it as a Cartesian
product space, with each coordinate corresponding to one of the component outcomes. We find that
unnecessary, and often confusing, in setting up the basic model. We simply suppose the basic space has enough
elements to consider each possible outcome. For the experiment of flipping a coin ten times, there must be
at least $2^{10} = 1024$ elements, one for each possible sequence of heads and tails.



3 This content is available online at <http://cnx.org/content/m23256/1.6/>.

Of more importance is describing the various events associated with the experiment. We begin by
identifying the appropriate component events. A component event is determined by propositions about the 
outcomes of the corresponding component trial. 

Example 4.17: Component events 

• In the coin flipping experiment, consider the event $H_3$ that the third toss results in a head.
Each outcome $\omega$ of the experiment may be represented by a sequence of H's and T's,
representing heads and tails. The event $H_3$ consists of those outcomes represented by sequences
with H in the third position. Suppose A is the event of a head on the third toss and a tail
on the ninth toss. This consists of those outcomes corresponding to sequences with H in the
third position and T in the ninth. Note that this event is the intersection $H_3 H_9^c$.

• A somewhat more complex example is as follows. Suppose there are two boxes, each containing
some red and some blue balls. The experiment consists of selecting at random a ball from
the first box, placing it in the second box, then making a random selection from the modified
contents of the second box. The composite trial is made up of two component selections. We
may let $R_1$ be the event of selecting a red ball on the first component trial (from the first
box), and $R_2$ be the event of selecting a red ball on the second component trial. Clearly $R_1$
and $R_2$ are component events.

In the first example, it is reasonable to assume that the class $\{H_i : 1 \le i \le 10\}$ is independent, and each
component probability is usually taken to be 0.5. In the second case, the assignment of probabilities is 
somewhat more involved. For one thing, it is necessary to know the numbers of red and blue balls in each 
box before the composite trial begins. When these are known, the usual assumptions and the properties of 
conditional probability suffice to assign probabilities. This approach of utilizing component events is used 
tacitly in some of the examples in the unit on Conditional Probability. 

When appropriate component events are determined, various Boolean combinations of these can be 
expressed as minterm expansions. 

Example 4.18 

Four persons take one shot each at a target. Let $E_i$ be the event the ith shooter hits the target
center. Let $A_3$ be the event exactly three hit the target. Then $A_3$ is the union of those minterms
generated by the $E_i$ which have three places uncomplemented.

$A_3 = E_1 E_2 E_3 E_4^c \bigvee E_1 E_2 E_3^c E_4 \bigvee E_1 E_2^c E_3 E_4 \bigvee E_1^c E_2 E_3 E_4$    (4.24)

Usually we would be able to assume the $E_i$ form an independent class. If each $P(E_i)$ is known,
then all minterm probabilities can be calculated easily.
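
The selection of minterms in (4.24) is mechanical and may be sketched with built-in functions. The hit probabilities below are hypothetical, chosen only for illustration since the example gives none, and the variable names are our own.

```matlab
% Sketch: minterms of four events with exactly three places uncomplemented.
n = 4;
M = dec2bin(0:2^n-1, n) - '0';            % minterm patterns for E1..E4
idx = find(sum(M, 2) == 3)' - 1;          % minterm numbers with exactly three 1s
disp(dec2bin(idx, n))                     % the patterns 0111, 1011, 1101, 1110

p = [0.8 0.7 0.6 0.5];                    % hypothetical P(E1)..P(E4)
Q = repmat(p, 2^n, 1);
pm = prod(M.*Q + (1 - M).*(1 - Q), 2)';   % minterm probabilities under independence
PA3 = sum(pm(idx + 1));                   % P(exactly three hit)
fprintf('P(A3) = %.4f\n', PA3)
```

The four minterm numbers found correspond, in reverse order, to the four terms written out in (4.24).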

The following is a somewhat more complicated example of this type. 

Example 4.19 

Ten race cars are involved in time trials to determine pole positions for an upcoming race. To
qualify, they must post an average speed of 125 mph or more on a trial run. Let $E_i$ be the event
the ith car makes qualifying speed. It seems reasonable to suppose the class $\{E_i : 1 \le i \le 10\}$ is
independent. If the respective probabilities for success are 0.90, 0.88, 0.93, 0.77, 0.85, 0.96, 0.72,
0.83, 0.91, 0.84, what is the probability that k or more will qualify (k = 6, 7, 8, 9, 10)?

SOLUTION

Let $A_k$ be the event exactly k qualify. The class $\{E_i : 1 \le i \le 10\}$ generates $2^{10} = 1024$ minterms.
The event $A_k$ is the union of those minterms which have exactly k places uncomplemented. The
event $B_k$ that k or more qualify is given by

$B_k = \bigvee_{r=k}^{n} A_r$    (4.25)

The task of computing and adding the minterm probabilities by hand would be tedious, to say the
least. However, we may use the function ckn, introduced in the unit on MATLAB and Independent
Classes and illustrated in Example 4.6, to determine the desired probabilities quickly and easily.

>> P = [0.90, 0.88, 0.93, 0.77, 0.85, 0.96, 0.72, 0.83, 0.91, 0.84];
>> k = 6:10;
>> PB = ckn(P,k)
PB = 0.9938 0.9628 0.8472 0.5756 0.2114

An alternate approach is considered in the treatment of random variables. 

4.3.2 Bernoulli trials and the binomial distribution

Many composite trials may be described as a sequence of success-failure trials. For each component trial in 
the sequence, the outcome is one of two kinds. One we designate a success and the other a failure. Examples 
abound: heads or tails in a sequence of coin flips, favor or disapprove of a proposition in a survey sample, 
and items from a production line meet or fail to meet specifications in a sequence of quality control checks. 
To represent the situation, we let $E_i$ be the event of a success on the ith component trial in the sequence.
The event of a failure on the ith component trial is thus $E_i^c$.

In many cases, we model the sequence as a Bernoulli sequence, in which the results on the successive
component trials are independent and have the same probabilities. Thus, formally, a sequence of
success-failure trials is Bernoulli iff

1. The class $\{E_i : 1 \le i\}$ is independent.

2. The probability $P(E_i) = p$, invariant with i.

Simulation of Bernoulli trials 

It is frequently desirable to simulate Bernoulli trials. By flipping coins, rolling a die with various numbers 
of sides (as used in certain games), or using spinners, it is relatively easy to carry this out physically. However, 
if the number of trials is large — say several hundred — the process may be time consuming. Also, there are 
limitations on the values of p, the probability of success. We have a convenient two-part m-procedure for 
simulating Bernoulli sequences. The first part, called btdata, sets the parameters. The second, called bt, 
uses the random number generator in MATLAB to produce a sequence of zeros and ones (for failures and 
successes). Repeated calls for bt produce new sequences. 

Example 4.20 

>> btdata
Enter n, the number of trials  10
Enter p, the probability of success on each trial  0.37
Call for bt
>> bt
n = 10   p = 0.37       % n is kept small to save printout space
Frequency = 0.4
To view the sequence, call for SEQ
>> disp(SEQ)            % optional call for the sequence
     1     1
     2     1
     3     0
     4     0
     5     0
     6     0
     7     0
     8     0
     9     1
    10     1

Repeated calls for bt yield new sequences with the same parameters. 

To illustrate the power of the program, it was used to take a run of 100,000 component trials, with probability 
p of success 0.37, as above. Successive runs gave relative frequencies 0.37001 and 0.36999. Unless the random 
number generator is "seeded" to make the same starting point each time, successive runs will give different 
sequences and usually different relative frequencies. 
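
The idea behind such a simulation can be sketched in a few lines. This is not the code of btdata or bt; we simply assume a success is recorded whenever a uniform random number falls below p. The names SEQ and freq are our own.

```matlab
% Sketch: simulate a Bernoulli sequence with the built-in uniform generator.
n = 100000;                  % number of component trials
p = 0.37;                    % probability of success on each trial
SEQ = rand(1, n) < p;        % 1 for success, 0 for failure
freq = sum(SEQ)/n;           % relative frequency of success
fprintf('n = %d   p = %.2f   Frequency = %.5f\n', n, p, freq)
% The generator is not seeded here, so each run gives a different sequence.
```

For n this large the relative frequency is typically within a few thousandths of p, since its standard deviation is $\sqrt{p(1-p)/n} \approx 0.0015$.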
The binomial distribution 

A basic problem in Bernoulli sequences is to determine the probability of k successes in n component
trials. We let $S_n$ be the number of successes in n trials. This is a special case of a simple random variable,
which we study in more detail in the chapter on "Random Variables and Probabilities" (Section 6.1).

Let us characterize the events $A_{kn} = \{S_n = k\}$, $0 \le k \le n$. As noted above, the event $A_{kn}$ of exactly k
successes is the union of the minterms generated by $\{E_i : 1 \le i\}$ in which there are k successes (represented
by k uncomplemented $E_i$) and $n - k$ failures (represented by $n - k$ complemented $E_i^c$). Simple combinatorics
show there are C(n, k) ways to choose the k places to be uncomplemented. Hence, among the $2^n$ minterms,
there are $C(n, k) = \frac{n!}{k!(n-k)!}$ which have k places uncomplemented. Each such minterm has probability
$p^k (1 - p)^{n-k}$. Since the minterms are mutually exclusive, their probabilities add. We conclude that

$P(S_n = k) = C(n, k) p^k (1 - p)^{n-k} = C(n, k) p^k q^{n-k}$ where $q = 1 - p$ for $0 \le k \le n$    (4.26)

These probabilities and the corresponding values form the distribution for $S_n$. This distribution is known
as the binomial distribution, with parameters (n, p). We shorten this to binomial (n, p), and often write
$S_n \sim$ binomial (n, p). A related set of probabilities is $P(S_n \ge k) = P(B_{kn})$, $0 \le k \le n$. If the number n of
component trials is small, direct computation of the probabilities is easy with hand calculators.

Example 4.21: A reliability problem 

A remote device has five similar components which fail independently, with equal probabilities. 
The system remains operable if three or more of the components are operative. Suppose each unit 
remains active for one year with probability 0.8. What is the probability the system will remain 
operative for that long? 
SOLUTION 

$P = C(5, 3)\, 0.8^3 \cdot 0.2^2 + C(5, 4)\, 0.8^4 \cdot 0.2 + C(5, 5)\, 0.8^5 = 10 \cdot 0.8^3 \cdot 0.2^2 + 5 \cdot 0.8^4 \cdot 0.2 + 0.8^5 = 0.9421$    (4.27)

Because Bernoulli sequences are used in so many practical situations as models for success-failure trials, the
probabilities $P(S_n = k)$ and $P(S_n \ge k)$ have been calculated and tabulated for a variety of combinations
of the parameters (n, p). Such tables are found in most mathematical handbooks. Tables of $P(S_n = k)$ are
usually given a title such as binomial distribution, individual terms. Tables of $P(S_n \ge k)$ have a designation
such as binomial distribution, cumulative terms. Note, however, some tables for cumulative terms give
$P(S_n \le k)$. Care should be taken to note which convention is used.

Example 4.22: A reliability problem 

Consider again the system of Example 4.21, above. Suppose we attempt to enter a table of Cumulative
Terms, Binomial Distribution at n = 5, k = 3, and p = 0.8. Most tables will not have probabilities
greater than 0.5. In this case, we may work with failures. We just interchange the role of $E_i$ and
$E_i^c$. Thus, the number of failures has the binomial (n, q) distribution. Now there are three or more
successes iff there are not three or more failures. We go to the table of cumulative terms at n = 5,
k = 3, and p = 0.2. The probability entry is 0.0579. The desired probability is 1 - 0.0579 = 0.9421.

In general, there are k or more successes in n trials iff there are not $n - k + 1$ or more failures.
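
The arithmetic of Examples 4.21 and 4.22 can be reproduced directly from (4.26). The following sketch uses the built-in nchoosek. The variable names are our own, and no table lookup is involved.

```matlab
% Sketch: reliability of the 3 of 5 system, by successes and by failures.
n = 5;  p = 0.8;  q = 1 - p;
k = 0:n;
C = arrayfun(@(j) nchoosek(n, j), k);     % binomial coefficients C(n, k)
ps = C .* p.^k .* q.^(n - k);             % P(exactly k successes), binomial (n, p)
pf = C .* q.^k .* p.^(n - k);             % P(exactly k failures), binomial (n, q)
R = sum(ps(k >= 3));                      % three or more successes
Pfail3 = sum(pf(k >= 3));                 % three or more failures
fprintf('R = %.4f   P(3 or more failures) = %.4f   1 - that = %.4f\n', R, Pfail3, 1 - Pfail3)
```

The two routes agree because three or more successes in five trials occur exactly when there are not three or more failures.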
m-functions for binomial probabilities

Although tables are convenient for calculation, they impose serious limitations on the available parameter 
values, and when the values are found in a table, they must still be entered into the problem. Fortunately, 
we have convenient m-functions for these distributions. When MATLAB is available, it is much easier to 
generate the needed probabilities than to look them up in a table, and the numbers are entered directly into 
the MATLAB workspace. And we have great freedom in selection of parameter values. For example we may 
use n of a thousand or more, while tables are usually limited to n of 20, or at most 30. The two m-functions 
for calculating $P(A_{kn})$ and $P(B_{kn})$ are

: $P(A_{kn})$ is calculated by y = ibinom(n,p,k), where k is a row or column vector of integers between 0 and
n. The result y is a row vector of the same size as k.

: $P(B_{kn})$ is calculated by y = cbinom(n,p,k), where k is a row or column vector of integers between 0 and
n. The result y is a row vector of the same size as k.

Example 4.23: Use of m-functions ibinom and cbinom 

If n = 10 and p = 0.39, determine $P(A_{kn})$ and $P(B_{kn})$ for k = 3, 5, 6, 8.

>> p = 0.39;
>> k = [3 5 6 8];
>> Pi = ibinom(10,p,k)      % individual probabilities
Pi = 0.2237 0.1920 0.1023 0.0090
>> Pc = cbinom(10,p,k)      % cumulative probabilities
Pc = 0.8160 0.3420 0.1500 0.0103

Note that we have used probability p = 0.39. It is quite unlikely that a table will have this probability. 
Although we use only n = 10, frequently it is desirable to use values of several hundred. The m-functions 
work well for n up to 1000 (and even higher for small values of p or for values very near to one). Hence, 
there is great freedom from the limitations of tables. If a table with a specific range of values is desired, an 
m-procedure called binomial produces such a table. The use of large n raises the question of cumulation of 
errors in sums or products. The level of precision in MATLAB calculations is sufficient that such roundoff 
errors are well below practical concerns.
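
One common way to keep the individual terms well behaved for large n is to evaluate them through logarithms. The sketch below does so with the built-in gammaln. We do not know whether ibinom and cbinom work this way, so this is offered only as an illustration, with our own variable names.

```matlab
% Sketch: binomial individual and cumulative terms computed in log space.
n = 10;  p = 0.39;  k = [3 5 6 8];
j = 0:n;
logpk = gammaln(n + 1) - gammaln(j + 1) - gammaln(n - j + 1) ...
        + j*log(p) + (n - j)*log(1 - p);       % log of C(n,j) p^j q^(n-j)
pk = exp(logpk);                               % P(S_n = j), j = 0..n
tail = fliplr(cumsum(fliplr(pk)));             % tail(j+1) = P(S_n >= j)
Pi = pk(k + 1);                                % individual probabilities
Pc = tail(k + 1);                              % cumulative probabilities
disp([k; Pi; Pc])

% The same expression remains usable for n of a thousand.
n2 = 1000;  j2 = 0:n2;
pk2 = exp(gammaln(n2 + 1) - gammaln(j2 + 1) - gammaln(n2 - j2 + 1) ...
          + j2*log(p) + (n2 - j2)*log(1 - p));
total2 = sum(pk2);                             % should be very close to one
```

For n = 10 the values agree, to four places, with those of Example 4.23.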

Example 4.24 



>> binomial                                  % call for procedure
Enter n, the number of trials  13
Enter p, the probability of success  0.413
Enter row vector k of success numbers  0:4
        n         p
   13.0000    0.4130
        k     P(X=k)    P(X>=k)
         0    0.0010    1.0000
    1.0000    0.0090    0.9990
    2.0000    0.0379    0.9900
    3.0000    0.0979    0.9521
    4.0000    0.1721    0.8542

Remark. While the m-procedure binomial is useful for constructing a table, it is usually not as convenient 
for problems as the m-functions ibinom or cbinom. The latter calculate the desired values and put them 
directly into the MATLAB workspace. 
4.3.3 Joint Bernoulli trials

Bernoulli trials may be used to model a variety of practical problems. One such is to compare the results of 
two sequences of Bernoulli trials carried out independently. The following simple example illustrates the use 
of MATLAB for this. 

Example 4.25: A joint Bernoulli trial 

Bill and Mary take ten basketball free throws each. We assume the two sequences of trials are
independent of each other, and each is a Bernoulli sequence. 

Mary: Has probability 0.80 of success on each trial. 

Bill: Has probability 0.85 of success on each trial. 

What is the probability Mary makes more free throws than Bill? 

SOLUTION 

We have two Bernoulli sequences, operating independently. 

Mary: n = 10, p = 0.80 

Bill: n= 10, p = 0.85 

Let

M be the event Mary wins

$M_k$ be the event Mary makes k or more free throws

$B_j$ be the event Bill makes exactly j free throws

Then Mary wins if Bill makes none and Mary makes one or more, or Bill makes one and Mary
makes two or more, etc. Thus

$M = B_0 M_1 \bigvee B_1 M_2 \bigvee \cdots \bigvee B_9 M_{10}$    (4.28)

and

$P(M) = P(B_0) P(M_1) + P(B_1) P(M_2) + \cdots + P(B_9) P(M_{10})$    (4.29)

We use cbinom to calculate the cumulative probabilities for Mary and ibinom to obtain the
individual probabilities for Bill.

>> pm = cbinom(10,0.8,1:10);      % cumulative probabilities for Mary
>> pb = ibinom(10,0.85,0:9);      % individual probabilities for Bill
>> D = [pm; pb]'                  % display: pm in the first column
D =                               % pb in the second column
    1.0000    0.0000
    1.0000    0.0000
    0.9999    0.0000
    0.9991    0.0001
    0.9936    0.0012
    0.9672    0.0085
    0.8791    0.0401
    0.6778    0.1298
    0.3758    0.2759
    0.1074    0.3474

To find the probability P (M) that Mary wins, we need to multiply each of these pairs together, then 
sum. This is just the dot or scalar product, which MATLAB calculates with the command pm*pb'.
We may combine the generation of the probabilities and the multiplication in one command:

>> P = cbinom(10,0.8,1:10)*ibinom(10,0.85,0:9)'
P = 0.2738
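
For a reader without cbinom and ibinom, the same calculation may be sketched with built-in functions. The anonymous function binpmf is our own hypothetical helper implementing (4.26), and the cumulative terms are formed by summing from the top.

```matlab
% Sketch: P(Mary makes more free throws than Bill), built-ins only.
binpmf = @(n, p) arrayfun(@(i) nchoosek(n, i), 0:n) .* p.^(0:n) .* (1 - p).^(n:-1:0);
pMary = binpmf(10, 0.80);                      % P(Mary makes exactly k), k = 0..10
pBill = binpmf(10, 0.85);                      % P(Bill makes exactly j), j = 0..10
tailMary = fliplr(cumsum(fliplr(pMary)));      % tailMary(k+1) = P(Mary makes k or more)
pm = tailMary(2:11);                           % P(M_k), k = 1..10
pb = pBill(1:10);                              % P(B_j), j = 0..9
P = pm*pb';                                    % sum of P(B_j) P(M_(j+1)) as in (4.29)
fprintf('P = %.4f\n', P)
```

Changing the first argument of binpmf, together with the index ranges, is one way to begin exploring the effect of different numbers of throws.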




The ease and simplicity of calculation with MATLAB make it feasible to consider the effect of different 
values of n. Is there an optimum number of throws for Mary? Why should there be an optimum? 

An alternate treatment of this problem in the unit on Independent Random Variables utilizes techniques 
for independent simple random variables. 

4.3.4 Alternate MATLAB implementations

Alternate implementations of the functions for probability calculations are found in the Statistical Package 
available as a supplementary package. We have utilized our formulation, so that only the basic MATLAB 
package is needed. 

4.4 Problems on Independence of Events 4 

Exercise 4.1 (Solution on p. 101.) 

The minterms generated by the class {A, B, C} have minterm probabilities 

pm = [0.15 0.05 0.02 0.18 0.25 0.05 0.18 0.12] (4.30) 

Show that the product rule holds for all three, but the class is not independent. 

Exercise 4.2 (Solution on p. 101.) 

The class {A, B, C, D} is independent, with respective probabilities 0.65, 0.37, 0.48, 0.63. Use 
the m-function minprob to obtain the minterm probabilities. Use the m-function minmap to put 
them in a 4 by 4 table corresponding to the minterm map convention we use. 

Exercise 4.3 (Solution on p. 101.) 

The minterm probabilities for the software survey in Example 2 (Example 2.2: Survey on software) 
from "Minterms" are 

pm= [0 0.05 0.10 0.05 0.20 0.10 0.40 0.10] (4.31) 

Show whether or not the class {A, B, C} is independent: (1) by hand calculation, and (2) by use 
of the m-function imintest. 

Exercise 4.4 (Solution on p. 101.) 

The minterm probabilities for the computer survey in Example 3 (Example 2.3: Survey on personal 
computers) from "Minterms" are 

pm = [0.032 0.016 0.376 0.011 0.364 0.073 0.077 0.051] (4.32) 

Show whether or not the class {A, B, C} is independent: (1) by hand calculation, and (2) by use 
of the m-function imintest. 

Exercise 4.5 (Solution on p. 101.) 

Minterm probabilities p(0) through p(15) for the class {A, B, C, D} are, in order,

pm = [0.084 0.196 0.036 0.084 0.085 0.196 0.035 0.084 0.021 0.049 0.009 0.021 0.020 0.049 0.010 0.021] 
Use the m-function imintest to show whether or not the class {A, B, C, D} is independent. 

Exercise 4.6 (Solution on p. 102.) 

Minterm probabilities p(0) through p(15) for the opinion survey in Example 4 (Example 2.4: 
Opinion survey) from "Minterms" are 

pm = [0.085 0.195 0.035 0.085 0.080 0.200 0.035 0.085 0.020 0.050 0.010 0.020 0.020 0.050 0.015 0.015] 
Show whether or not the class {A, B, C, D} is independent. 



4 This content is available online at <http://cnx.org/content/m24180/1.4/>.

Exercise 4.7 (Solution on p. 102.)

The class {A, B, C} is independent, with $P(A) = 0.30$, $P(B^c C) = 0.32$, and $P(AC) = 0.12$.
Determine the minterm probabilities.

Exercise 4.8 (Solution on p. 102.) 

The class {A, B, C} is independent, with $P(A \cup B) = 0.6$, $P(A \cup C) = 0.7$, and $P(C) = 0.4$.
Determine the probability of each minterm. 

Exercise 4.9 (Solution on p. 102.) 

A pair of dice is rolled five times. What is the probability the first two results are "sevens" and the 
others are not? 

Exercise 4.10 (Solution on p. 102.) 

David, Mary, Joan, Hal, Sharon, and Wayne take an exam in their probability course. Their 
probabilities of making 90 percent or more are 

0.72 0.83 0.75 0.92 0.65 0.79 (4.33) 

respectively. Assume these are independent events. What is the probability three or more, four 
or more, five or more make grades of at least 90 percent? 

Exercise 4.11 (Solution on p. 102.) 

Two independent random numbers between 0 and 1 are selected (say by a random number generator
on a calculator). What is the probability the first is no greater than 0.33 and the other is at least
0.57?

Exercise 4.12 (Solution on p. 102.) 

Helen is wondering how to plan for the weekend. She will get a letter from home (with money) 
with probability 0.05. There is a probability of 0.85 that she will get a call from Jim at SMU in 
Dallas. There is also a probability of 0.5 that William will ask for a date. What is the probability 
she will get money and Jim will not call or that both Jim will call and William will ask for a date? 

Exercise 4.13 (Solution on p. 103.) 

A basketball player takes ten free throws in a contest. On her first shot she is nervous and has
probability 0.3 of making the shot. She begins to settle down, and the probabilities on the next seven
shots are 0.5, 0.6, 0.7, 0.8, 0.8, 0.8, and 0.85, respectively. Then she realizes her opponent is doing
well, and becomes tense as she takes the last two shots, with probabilities reduced to 0.75 and 0.65.
Assuming independence between the shots, what is the probability she will make k or more for
k = 2, 3, ..., 10?

Exercise 4.14 (Solution on p. 103.)

In a group there are M men and W women; m of the men and w of the women are college 
graduates. An individual is picked at random. Let A be the event the individual is a woman and B 
be the event he or she is a college graduate. Under what condition is the pair {A, B} independent? 

Exercise 4.15 (Solution on p. 103.) 

Consider the pair {A, B} of events. Let $P(A) = p$, $P(A^c) = q = 1 - p$, $P(B \mid A) = p_1$, and
$P(B \mid A^c) = p_2$. Under what condition is the pair {A, B} independent?

Exercise 4.16 (Solution on p. 103.) 

Show that if event A is independent of itself, then $P(A) = 0$ or $P(A) = 1$. (This fact is key to an
important "zero-one law.") 

Exercise 4.17 (Solution on p. 103.) 

If {A, B} is independent and {B, C} is independent, does it follow that {A, C} is independent? Justify your
answer.

Exercise 4.18 (Solution on p. 103.)

Suppose event A implies B (i.e., $A \subset B$). Show that if the pair {A, B} is independent, then either
$P(A) = 0$ or $P(B) = 1$.

Exercise 4.19 (Solution on p. 103.) 

A company has three task forces trying to meet a deadline for a new device. The groups work 
independently, with respective probabilities 0.8, 0.9, 0.75 of completing on time. What is the 
probability at least one group completes on time? (Think. Then solve "by hand.") 

Exercise 4.20 (Solution on p. 103.) 

Two salesmen work differently. Roland spends more time with his customers than does Betty, hence 
tends to see fewer customers. On a given day Roland sees five customers and Betty sees six. The 
customers make decisions independently. If the probabilities for success on Roland's customers are 
0.7,0.8,0.8,0.6,0.7 and for Betty's customers are 0.6,0.5,0.4,0.6,0.6,0.4, what is the probability 
Roland makes more sales than Betty? What is the probability that Roland will make three or more 
sales? What is the probability that Betty will make three or more sales? 

Exercise 4.21 (Solution on p. 104.) 

Two teams of students take a probability exam. The entire group performs individually and 
independently. Team 1 has five members and Team 2 has six members. They have the following 
individual probabilities of making an "A" on the exam.

Team 1: 0.83 0.87 0.92 0.77 0.86
Team 2: 0.68 0.91 0.74 0.68 0.73 0.83

a. What is the probability team 1 will make at least as many A's as team 2? 

b. What is the probability team 1 will make more A's than team 2? 

Exercise 4.22 (Solution on p. 104.) 

A system has five components which fail independently. Their respective reliabilities are 0.93, 0.91, 
0.78, 0.88, 0.92. Units 1 and 2 operate as a "series" combination. Units 3, 4, 5 operate as a two
of three subsystem. The two subsystems operate as a parallel combination to make the complete
system. What is the reliability of the complete system?

Exercise 4.23 (Solution on p. 104.) 

A system has eight components with respective probabilities 

0.96 0.90 0.93 0.82 0.85 0.97 0.88 0.80 (4.34) 

Units 1 and 2 form a parallel subsystem in series with unit 3 and a three of five combination of
units 4 through 8. What is the reliability of the complete system? 

Exercise 4.24 (Solution on p. 104.) 

How would the reliability of the system in Exercise 4.23 change if units 1, 2, and 3 formed a parallel 
combination in series with the three of five combination? 

Exercise 4.25 (Solution on p. 104.) 

How would the reliability of the system in Exercise 4.23 change if the reliability of unit 3 were 
changed from 0.93 to 0.96? What change if the reliability of unit 2 were changed from 0.90 to 0.95 
(with unit 3 unchanged)? 

Exercise 4.26 (Solution on p. 105.) 

Three fair dice are rolled. What is the probability at least one will show a six? 

Exercise 4.27 (Solution on p. 105.) 

A hobby shop finds that 35 percent of its customers buy an electronic game. If customers buy 
independently, what is the probability that at least one of the next five customers will buy an 
electronic game?

Exercise 4.28 (Solution on p. 105.)

Under extreme noise conditions, the probability that a certain message will be transmitted correctly 
is 0.1. Successive messages are acted upon independently by the noise. Suppose the message is 
transmitted ten times. What is the probability it is transmitted correctly at least once? 

Exercise 4.29 (Solution on p. 105.) 

Suppose the class $\{A_i : 1 \le i \le n\}$ is independent, with $P(A_i) = p_i$, $1 \le i \le n$. What is the
probability that at least one of the events occurs? What is the probability that none occurs? 

Exercise 4.30 (Solution on p. 105.) 

In one hundred random digits, 0 through 9, with each possible digit equally likely on each choice,
what is the probability 8 or more are sevens?

Exercise 4.31 (Solution on p. 105.) 

Ten customers come into a store. If the probability is 0.15 that each customer will buy a television 
set, what is the probability the store will sell three or more? 

Exercise 4.32 (Solution on p. 105.) 

Seven similar units are put into service at time t = 0. The units fail independently. The probability 
of failure of any unit in the first 400 hours is 0.18. What is the probability that three or more units 
are still in operation at the end of 400 hours? 

Exercise 4.33 (Solution on p. 105.) 

A computer system has ten similar modules. The circuit has redundancy which ensures the system 
operates if any eight or more of the units are operative. Units fail independently, and the probability 
is 0.93 that any unit will survive between maintenance periods. What is the probability of no system 
failure due to these units? 

Exercise 4.34 (Solution on p. 105.) 

Only thirty percent of the items from a production line meet stringent requirements for a special 
job. Units from the line are tested in succession. Under the usual assumptions for Bernoulli trials, 
what is the probability that three satisfactory units will be found in eight or fewer trials? 

Exercise 4.35 (Solution on p. 105.) 

The probability is 0.02 that a virus will survive application of a certain vaccine. What is the 
probability that in a batch of 500 viruses, fifteen or more will survive treatment? 

Exercise 4.36 (Solution on p. 105.) 

In a shipment of 20,000 items, 400 are defective. These are scattered randomly throughout the 
entire lot. Assume the probability of a defective is the same on each choice. What is the probability 
that 

1. Two or more will appear in a random sample of 35? 

2. At most five will appear in a random sample of 50? 

Exercise 4.37 (Solution on p. 105.) 

A device has probability p of operating successfully on any trial in a sequence. What probability 
p is necessary to ensure the probability of successes on all of the first four trials is 0.85? With that 
value of p, what is the probability of four or more successes in five trials? 

Exercise 4.38 (Solution on p. 105.) 

A survey form is sent to 100 persons. If they decide independently whether or not to reply, 
and each has probability 1/4 of replying, what is the probability of k or more replies, where 
k= 15,20,25,30,35,40? 

Exercise 4.39 (Solution on p. 105.) 

Ten numbers are produced by a random number generator. What is the probability four or more
are less than or equal to 0.63?

Exercise 4.40 (Solution on p. 105.)

A player rolls a pair of dice five times. She scores a "hit" on any throw if she gets a 6 or 7. She wins 
iff she scores an odd number of hits in the five throws. What is the probability a player wins on any 
sequence of five throws? Suppose she plays the game 20 successive times. What is the probability 
she wins at least 10 times? What is the probability she wins more than half the time? 

Exercise 4.41 (Solution on p. 106.) 

Erica and John spin a wheel which turns up the integers 0 through 9 with equal probability. Results
on various trials are independent. Each spins the wheel 10 times. What is the probability Erica 
turns up a seven more times than does John? 

Exercise 4.42 (Solution on p. 106.) 

Erica and John play a different game with the wheel, above. Erica scores a point each time she gets 
an integer 0, 2, 4, 6, or 8. John scores a point each time he turns up a 1, 2, 5, or 7. Erica spins
eight times; John spins ten times. What is the probability John makes more points than Erica?

Exercise 4.43 (Solution on p. 106.) 

A box contains 100 balls; 30 are red, 40 are blue, and 30 are green. Martha and Alex select at 
random, with replacement and mixing after each selection. Alex has a success if he selects a red 
ball; Martha has a success if she selects a blue ball. Alex selects seven times and Martha selects 
five times. What is the probability Martha has more successes than Alex? 

Exercise 4.44 (Solution on p. 106.) 

Two players roll a fair die 30 times each. What is the probability that each rolls the same number 
of sixes? 

Exercise 4.45 (Solution on p. 106.) 

A warehouse has a stock of n items of a certain kind, r of which are defective. Two of the items are 
chosen at random, without replacement. What is the probability that at least one is defective? Show 
that for large n the number is very close to that for selection with replacement, which corresponds 
to two Bernoulli trials with probability $p = r/n$ of success on any trial.

Exercise 4.46 (Solution on p. 106.) 

A coin is flipped repeatedly, until a head appears. Show that with probability one the game will 
terminate. 

TIP: The probability of not terminating in n trials is $q^n$.

Exercise 4.47 (Solution on p. 106.) 

Two persons play a game consecutively until one of them is successful or there are ten unsuccessful
plays. Let $E_i$ be the event of a success on the $i$th play of the game. Suppose $\{E_i : 1 \le i\}$ is an
independent class with $P(E_i) = p_1$ for $i$ odd and $P(E_i) = p_2$ for $i$ even. Let A be the event the
first player wins, B be the event the second player wins, and C be the event that neither wins.

a. Express A, B, and C in terms of the $E_i$.

b. Determine $P(A)$, $P(B)$, and $P(C)$ in terms of $p_1$, $p_2$, $q_1 = 1 - p_1$, and $q_2 = 1 - p_2$. Obtain
numerical values for the case $p_1 = 1/4$ and $p_2 = 1/3$.

c. Use appropriate facts about the geometric series to show that $P(A) = P(B)$ iff $p_1 = p_2/(1 + p_2)$.

d. Suppose $p_2 = 0.5$. Use the result of part (c) to find the value of $p_1$ to make $P(A) = P(B)$
and then determine $P(A)$, $P(B)$, and $P(C)$.

Exercise 4.48 (Solution on p. 106.) 

Three persons play a game consecutively until one achieves his objective. Let $E_i$ be the event of
a success on the $i$th trial, and suppose $\{E_i : 1 \le i\}$ is an independent class, with $P(E_i) = p_1$ for
$i = 1, 4, 7, \ldots$, $P(E_i) = p_2$ for $i = 2, 5, 8, \ldots$, and $P(E_i) = p_3$ for $i = 3, 6, 9, \ldots$. Let A, B, C be the
respective events the first, second, and third player wins.

a. Express A, B, and C in terms of the $E_i$.

b. Determine the probabilities in terms of $p_1$, $p_2$, $p_3$, then obtain numerical values in the case
$p_1 = 1/4$, $p_2 = 1/3$, and $p_3 = 1/2$.

Exercise 4.49 (Solution on p. 107.) 

What is the probability of a success on the ith trial in a Bernoulli sequence of n component trials, 
given there are r successes? 

Exercise 4.50 (Solution on p. 107.) 

A device has N similar components which may fail independently, with probability p of failure of
any component. The device fails if one or more of the components fails. In the event of failure of 
the device, the components are tested sequentially. 

a. What is the probability the first defective unit tested is the nth, given one or more components 
have failed? 

b. What is the probability the defective unit is the nth, given that exactly one has failed? 

c. What is the probability that more than one unit has failed, given that the first defective unit
is the nth?

Solutions to Exercises in Chapter 4

Solution to Exercise 4.1 (p. 95) 

pm = [0.15 0.05 0.02 0.18 0.25 0.05 0.18 0.12]; 
y = imintest(pm) 
The class is NOT independent 
Minterms for which the product rule fails 

y =
     1     1     1     0
     1     1     1     0    % The product rule holds for M7 = ABC
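
The test performed by imintest can be mimicked with a short Python sketch. This is an illustration under stated assumptions, not the text's m-function: it assumes minterm k is indexed by the binary digits of k with the first event as the most significant bit, and it returns the indices of the failing minterms rather than a minterm map. The name `independence_check` is hypothetical.

```python
from itertools import product

def independence_check(pm, tol=1e-9):
    """Return (is_independent, failures) for a list of 2**n minterm probabilities."""
    n = len(pm).bit_length() - 1
    assert len(pm) == 2**n, "need 2**n minterm probabilities"
    patterns = list(product([0, 1], repeat=n))   # patterns[k] = binary digits of k
    # Marginal probability of each event: sum of the minterms in which it occurs.
    marg = [sum(p for bits, p in zip(patterns, pm) if bits[i]) for i in range(n)]
    failures = []
    for k, bits in enumerate(patterns):
        expected = 1.0
        for i, b in enumerate(bits):
            expected *= marg[i] if b else 1 - marg[i]
        if abs(expected - pm[k]) > tol:          # product rule fails for minterm k
            failures.append(k)
    return (not failures), failures

pm = [0.15, 0.05, 0.02, 0.18, 0.25, 0.05, 0.18, 0.12]
independent, failures = independence_check(pm)
print(independent, failures)
```

For the data of Exercise 4.1 the product rule holds only for minterms 6 and 7, which is consistent with the two zeros in the map above.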

Solution to Exercise 4.2 (p. 95) 





P = [0.65 0.37 0.48 0.63];
p = minmap(minprob(P))
p =
    0.0424    0.0249    0.0788    0.0463
    0.0722    0.0424    0.1342    0.0788
    0.0392    0.0230    0.0727    0.0427
    0.0667    0.0392    0.1238    0.0727

Solution to Exercise 4.3 (p. 95) 

pm = [0 0.05 0.10 0.05 0.20 0.10 0.40 0.10]; 
y = imintest(pm) 
The class is NOT independent 
Minterms for which the product rule fails 

y =
     1     1     1     1    % By hand check product rule for any minterm
     1     1     1     1

Solution to Exercise 4.4 (p. 95) 

npr04_04 (Section 17.8.22: npr04_04)
Minterm probabilities for Exercise 4.4 are in pm
y = imintest(pm)
The class is NOT independent
Minterms for which the product rule fails
y =
     1     1     1     1
     1     1     1     1

Solution to Exercise 4.5 (p. 95) 

npr04_05 (Section 17.8.23: npr04_05)
Minterm probabilities for Exercise 4.5 are in pm
imintest(pm)
The class is NOT independent
Minterms for which the product rule fails
ans =
     0     1     0     1
     0     0     0     0
     0     1     0     1
     0     0     0     0

Solution to Exercise 4.6 (p. 95) 

npr04_06
Minterm probabilities for Exercise 4.6 are in pm
y = imintest(pm)
The class is NOT independent
Minterms for which the product rule fails
y =
     1     1     1     1
     1     1     1     1
     1     1     1     1
     1     1     1     1

Solution to Exercise 4.7 (p. 96) 

$P(C) = P(AC)/P(A) = 0.40$ and $P(B) = 1 - P(B^c C)/P(C) = 0.20$.

pm = minprob([0.3 0.2 0.4]) 
pm = 0.3360 0.2240 0.0840 0.0560 0.1440 0.0960 0.0360 0.0240 
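
As a cross-check on the minterm probabilities just listed, here is a minimal Python sketch of the product-rule construction for an independent class. It borrows the name `minprob` from the text's m-function for readability, but the code is an illustrative reimplementation that assumes minterms are ordered by binary index with the first event as the most significant bit.

```python
from itertools import product

def minprob(probs):
    """Minterm probabilities for an independent class with P(A_i) = probs[i]."""
    out = []
    for bits in product([0, 1], repeat=len(probs)):
        term = 1.0
        for p, b in zip(probs, bits):
            term *= p if b else 1 - p   # event occurs (b = 1) or its complement does
        out.append(term)
    return out

pm = minprob([0.3, 0.2, 0.4])
print([round(x, 4) for x in pm])
```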

Solution to Exercise 4.8 (p. 96) 

$P(A^c C^c) = P(A^c) P(C^c) = 0.3$ implies $P(A^c) = 0.3/0.6 = 0.5 = P(A)$.

$P(A^c B^c) = P(A^c) P(B^c) = 0.4$ implies $P(B^c) = 0.4/0.5 = 0.8$, which implies $P(B) = 0.2$.

P = [0.5 0.2 0.4] ; 
pm = minprob(P) 
pm = 0.2400 0.1600 0.0600 0.0400 0.2400 0.1600 0.0600 0.0400 



Solution to Exercise 4.9 (p. 96) 

$P = (1/6)^2 (5/6)^3 = 0.0161$.
Solution to Exercise 4.10 (p. 96) 



P = 0.01* [72 83 75 92 65 79]; 
y = ckn(P,[3 4 5]) 
y = 0.9780 0.8756 0.5967 
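
The quantity computed by ckn, the probability that k or more of n independent events occur when the events have different probabilities, can be obtained by building the distribution of the count one event at a time. The sketch below is an assumed reimplementation in Python for illustration only; it reuses the name `ckn` but is not the text's m-function.

```python
def ckn(probs, k):
    """P(k or more of the independent events occur), with probs[i] = P(E_i)."""
    # dist[j] = probability that exactly j of the events considered so far occur
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, d in enumerate(dist):
            nxt[j] += d * (1 - p)      # current event does not occur
            nxt[j + 1] += d * p        # current event occurs
        dist = nxt
    return sum(dist[k:])

P = [0.72, 0.83, 0.75, 0.92, 0.65, 0.79]
y = [ckn(P, k) for k in (3, 4, 5)]
print([round(v, 4) for v in y])
```

The same function handles the "at least one" questions later in this set; for example, `ckn([0.8, 0.9, 0.75], 1)` equals 1 - 0.2 · 0.1 · 0.25.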

Solution to Exercise 4.11 (p. 96) 

$P = 0.33 \cdot (1 - 0.57) = 0.1419$
Solution to Exercise 4.12 (p. 96) 

A ~ letter with money, B ~ call from Jim, C ~ William asks for a date

P = 0.01*[5 85 50];
minvec3
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
pm = minprob(P);
p = ((A&Bc)|(B&C))*pm'
p = 0.4325

Solution to Exercise 4.13 (p. 96)

P = 0.01*[30 50 60 70 80 80 80 85 75 65];
k = 2:10;
p = ckn(P,k)
p =
  Columns 1 through 7
    0.9999    0.9984    0.9882    0.9441    0.8192    0.5859    0.3043
  Columns 8 through 9
    0.0966    0.0134

Solution to Exercise 4.14 (p. 96) 

$P(A \mid B) = w/(m + w) = W/(W + M) = P(A)$

Solution to Exercise 4.15 (p. 96)

$p_1 = P(B \mid A) = P(B \mid A^c) = p_2$ (see table of equivalent conditions).

Solution to Exercise 4.16 (p. 96)

$P(A) = P(A \cap A) = P(A) P(A)$. $x^2 = x$ iff $x = 0$ or $x = 1$.
Solution to Exercise 4.17 (p. 96) 

% No. Consider for example the following minterm probabilities:
pm = [0.2 0.05 0.125 0.125 0.05 0.2 0.125 0.125];
minvec3
Variables are A, B, C, Ac, Bc, Cc
They may be renamed, if desired.
PA = A*pm'
PA = 0.5000
PB = B*pm'
PB = 0.5000
PC = C*pm'
PC = 0.5000
PAB = (A&B)*pm'
PAB = 0.2500        % Product rule holds
PBC = (B&C)*pm'
PBC = 0.2500        % Product rule holds
PAC = (A&C)*pm'
PAC = 0.3250        % Product rule fails
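
The minterm-vector calculation in this solution amounts to summing the minterm probabilities over the minterms that make up an event. A hedged Python sketch of that idea follows; the helper `prob` and the use of lambda functions for events are choices made here for illustration, and the binary ordering of minterms (A as most significant digit) is an assumption.

```python
from itertools import product

pm = [0.2, 0.05, 0.125, 0.125, 0.05, 0.2, 0.125, 0.125]
# Truth values (a, b, c) of A, B, C on minterms m0 through m7.
minterms = list(product([False, True], repeat=3))

def prob(event):
    """P(event), where event maps the truth values (a, b, c) to True or False."""
    return sum(p for (a, b, c), p in zip(minterms, pm) if event(a, b, c))

PA = prob(lambda a, b, c: a)
PB = prob(lambda a, b, c: b)
PC = prob(lambda a, b, c: c)
PAB = prob(lambda a, b, c: a and b)
PBC = prob(lambda a, b, c: b and c)
PAC = prob(lambda a, b, c: a and c)
print(PAB - PA * PB, PBC - PB * PC, PAC - PA * PC)
```

The first two differences vanish (up to rounding) and the third does not, which is the counterexample the solution describes. The same helper evaluates Boolean combinations such as the one in the solution to Exercise 4.12.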

Solution to Exercise 4.18 (p. 97) 

$A \subset B$ implies $P(AB) = P(A)$; independence implies $P(AB) = P(A) P(B)$. $P(A) = P(A) P(B)$ only if
$P(B) = 1$ or $P(A) = 0$.
Solution to Exercise 4.19 (p. 97) 

At least one completes iff not all fail. P = 1 - 0.2 • 0.1 • 0.25 = 0.9950 
Solution to Exercise 4.20 (p. 97) 

PR = 0.1*[7 8 8 6 7];
PB = 0.1*[6 5 4 6 6 4];
PR3 = ckn(PR,3)
PR3 = 0.8662
PB3 = ckn(PB,3)
PB3 = 0.6906
PRgB = ikn(PB,0:4)*ckn(PR,1:5)'
PRgB = 0.5065

Solution to Exercise 4.21 (p. 97) 



P1 = 0.01*[83 87 92 77 86];
P2 = 0.01*[68 91 74 68 73 83];
P1geq = ikn(P2,0:5)*ckn(P1,0:5)'
P1geq = 0.5527
P1g = ikn(P2,0:4)*ckn(P1,1:5)'
P1g = 0.2561

Solution to Exercise 4.22 (p. 97) 



R = 0.01*[93 91 78 88 92];
Ra = prod(R(1:2))
Ra = 0.8463
Rb = ckn(R(3:5),2)
Rb = 0.9506
Rs = parallel([Ra Rb])
Rs = 0.9924
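
The series, parallel, and k of n building blocks used in this solution can be sketched in Python as follows. This is an illustration, not the text's code: it assumes that `parallel` means "at least one unit operates" (one minus the product of the failure probabilities), and the function names `series` and `k_of_n` are introduced here. The k of n reliability is found by brute-force enumeration of unit states, which is adequate for a handful of units.

```python
from itertools import product
from math import prod

def series(rel):
    """All units must operate."""
    return prod(rel)

def parallel(rel):
    """At least one unit must operate."""
    return 1 - prod(1 - r for r in rel)

def k_of_n(rel, k):
    """At least k of the units must operate (enumerates all 2**n unit states)."""
    total = 0.0
    for state in product([0, 1], repeat=len(rel)):
        if sum(state) >= k:
            total += prod(r if s else 1 - r for r, s in zip(rel, state))
    return total

R = [0.93, 0.91, 0.78, 0.88, 0.92]
Ra = series(R[0:2])          # units 1 and 2 in series
Rb = k_of_n(R[2:5], 2)       # units 3, 4, 5 as a two of three subsystem
Rs = parallel([Ra, Rb])      # the two subsystems in parallel
print(round(Ra, 4), round(Rb, 4), round(Rs, 4))
```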

Solution to Exercise 4.23 (p. 97) 



R = 0.01*[96 90 93 82 85 97 88 80];
Ra = parallel(R(1:2))
Ra = 0.9960
Rb = ckn(R(4:8),3)
Rb = 0.9821
Rs = prod([Ra R(3) Rb])
Rs = 0.9097

Solution to Exercise 4.24 (p. 97) 



Rc = parallel(R(1:3))
Rc = 0.9997
Rss = prod([Rb Rc])
Rss = 0.9818

Solution to Exercise 4.25 (p. 97) 



R1 = R;
R1(3) = 0.96;
Ra = parallel(R1(1:2))
Ra = 0.9960
Rb = ckn(R1(4:8),3)
Rb = 0.9821
Rs3 = prod([Ra R1(3) Rb])
Rs3 = 0.9390
R2 = R;
R2(2) = 0.95;
Ra = parallel(R2(1:2))
Ra = 0.9980
Rb = ckn(R2(4:8),3)
Rb = 0.9821
Rs4 = prod([Ra R2(3) Rb])
Rs4 = 0.9115

Solution to Exercise 4.26 (p. 97) 

$P = 1 - (5/6)^3 = 0.4213$

Solution to Exercise 4.27 (p. 97)

$P = 1 - 0.65^5 = 0.8840$

Solution to Exercise 4.28 (p. 98)

$P = 1 - 0.9^{10} = 0.6513$

Solution to Exercise 4.29 (p. 98)

$P_1 = 1 - P_0$, where $P_0 = \prod_{i=1}^{n} (1 - p_i)$

Solution to Exercise 4.30 (p. 98) 

P = cbinom (100, 0.1, 8) = 0.7939 
Solution to Exercise 4.31 (p. 98) 

P = cbinom (10, 0.15, 3) = 0.1798 
Solution to Exercise 4.32 (p. 98) 

P = cbinom (7, 0.82, 3) = 0.9971 
Solution to Exercise 4.33 (p. 98) 

P = cbinom (10, 0.93, 8) = 0.9717 
Solution to Exercise 4.34 (p. 98) 

P = cbinom (8, 0.3, 3) = 0.4482 
Solution to Exercise 4.35 (p. 98) 

P = cbinom(500, 0.02, 15) = 0.0814
Solution to Exercise 4.36 (p. 98) 

• P1 = cbinom(35, 0.02, 2) = 0.1547

• P2 = 1 - cbinom(50, 0.02, 6) = 0.9995

Solution to Exercise 4.37 (p. 98) 

$p = 0.85^{1/4}$, P = cbinom(5, p, 4) = 0.9854
Solution to Exercise 4.38 (p. 98) 



P = cbinom(100, 1/4, 15:5:40) 
0.9946 0.9005 0.5383 0.1495 0.0164 0.0007 



Solution to Exercise 4.39 (p. 98) 

P = cbinom(10, 0.63, 4) = 0.9644
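
Several of the preceding solutions reduce to a single cumulative binomial probability P(X ≥ k). A minimal Python sketch is given below for readers who want to check such values without the text's m-functions. It reuses the name `cbinom` for readability, assumes 0 < p < 1, and builds the probabilities P(X = j) by the usual recurrence rather than from binomial coefficients; this is a design choice made here, not necessarily how the m-function works.

```python
def cbinom(n, p, k):
    """P(X >= k) for X ~ binomial(n, p), with 0 < p < 1."""
    q = 1 - p
    term = q**n        # P(X = 0)
    below = 0.0        # accumulates P(X < k)
    for j in range(k):
        below += term
        term *= (n - j) / (j + 1) * p / q   # P(X = j + 1) from P(X = j)
    return 1 - below

p31 = cbinom(10, 0.15, 3)            # Exercise 4.31
p34 = cbinom(8, 0.30, 3)             # Exercise 4.34
p37 = cbinom(5, 0.85**0.25, 4)       # Exercise 4.37
print(round(p31, 4), round(p34, 4), round(p37, 4))
```

The "at least one" answers are the special case k = 1; for instance `cbinom(10, 0.1, 1)` agrees with 1 - 0.9^10 from Exercise 4.28.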
Solution to Exercise 4.40 (p. 99) 



Each roll yields a hit with probability $p = \frac{6}{36} + \frac{5}{36} = \frac{11}{36}$.

PW = sum(ibinom(5,11/36,[1 3 5]))
PW = 0.4956
P2 = cbinom(20,PW,10)
P2 = 0.5724
P3 = cbinom(20,PW,11)
P3 = 0.3963

Solution to Exercise 4.41 (p. 99)

P = ibinom(10, 0.1, 0:9) * cbinom(10, 0.1, 1:10)' = 0.3437

Solution to Exercise 4.42 (p. 99)

P = ibinom(8, 0.5, 0:8) * cbinom(10, 0.4, 1:9)' = 0.4030

Solution to Exercise 4.43 (p. 99)

P = ibinom(7, 0.3, 0:4) * cbinom(5, 0.4, 1:5)' = 0.3613

Solution to Exercise 4.44 (p. 99)

P = sum(ibinom(30, 1/6, 0:30).^2) = 0.1386
Solution to Exercise 4.45 (p. 99) 

$$P_1 = \frac{r}{n} \cdot \frac{r-1}{n-1} + \frac{r}{n} \cdot \frac{n-r}{n-1} + \frac{n-r}{n} \cdot \frac{r}{n-1} = \frac{(2n-1)r - r^2}{n(n-1)}$$ (4.35)

$$P_2 = 1 - \left(1 - \frac{r}{n}\right)^2 = \frac{2nr - r^2}{n^2}$$ (4.36)
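
To see how close the two answers are for large n, the expressions can be evaluated exactly with rational arithmetic. The Python sketch below is illustrative; the function names are hypothetical, and the without-replacement probability is computed from the complement (neither item defective) as an independent check on the closed form (4.35).

```python
from fractions import Fraction

def p_without_replacement(n, r):
    """P(at least one defective) when two items are drawn without replacement."""
    return 1 - Fraction(n - r, n) * Fraction(n - r - 1, n - 1)

def p_with_replacement(n, r):
    """P(at least one success) in two Bernoulli trials with p = r/n."""
    return 1 - (1 - Fraction(r, n))**2

def closed_form_p1(n, r):
    """Right-hand side of (4.35)."""
    return Fraction((2 * n - 1) * r - r * r, n * (n - 1))

for n, r in [(10, 3), (100, 30), (10000, 3000)]:
    gap = p_without_replacement(n, r) - p_with_replacement(n, r)
    print(n, r, float(gap))
```

Subtracting the two formulas gives the gap r(n - r)/(n^2 (n - 1)), which is at most 1/(4(n - 1)) and so tends to zero as n grows with r/n fixed.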

Solution to Exercise 4.46 (p. 99) 

Let $N$ = event never terminates and $N_k$ = event does not terminate in $k$ plays. Then $N \subset N_k$ for all $k$
implies $0 \le P(N) \le P(N_k) = 1/2^k$ for all $k$. We conclude $P(N) = 0$.
Solution to Exercise 4.47 (p. 99) 

a. $C = \bigcap_{i=1}^{10} E_i^c$

$$A = E_1 \vee E_1^c E_2^c E_3 \vee E_1^c E_2^c E_3^c E_4^c E_5 \vee E_1^c E_2^c E_3^c E_4^c E_5^c E_6^c E_7 \vee E_1^c E_2^c E_3^c E_4^c E_5^c E_6^c E_7^c E_8^c E_9$$ (4.37)

$$B = E_1^c E_2 \vee E_1^c E_2^c E_3^c E_4 \vee E_1^c E_2^c E_3^c E_4^c E_5^c E_6 \vee E_1^c E_2^c E_3^c E_4^c E_5^c E_6^c E_7^c E_8 \vee E_1^c E_2^c E_3^c E_4^c E_5^c E_6^c E_7^c E_8^c E_9^c E_{10}$$ (4.38)

b.

$$P(A) = p_1 \left[1 + q_1 q_2 + (q_1 q_2)^2 + (q_1 q_2)^3 + (q_1 q_2)^4\right] = p_1 \frac{1 - (q_1 q_2)^5}{1 - q_1 q_2}$$ (4.39)

$$P(B) = q_1 p_2 \frac{1 - (q_1 q_2)^5}{1 - q_1 q_2}, \qquad P(C) = (q_1 q_2)^5$$ (4.40)

For $p_1 = 1/4$, $p_2 = 1/3$, we have $q_1 q_2 = 1/2$ and $q_1 p_2 = 1/4$. In this case

$$P(A) = \frac{1}{4} \cdot \frac{31}{16} = 31/64 = 0.4844 = P(B), \qquad P(C) = 1/32$$ (4.41)

Note that $P(A) + P(B) + P(C) = 1$.

c. $P(A) = P(B)$ iff $p_1 = q_1 p_2 = (1 - p_1) p_2$ iff $p_1 = p_2/(1 + p_2)$.

d. $p_1 = 0.5/1.5 = 1/3$
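
The closed forms in part (b) can be checked by stepping through the plays one at a time. The following Python sketch does this with exact fractions; the function name `alternating_game` and its arguments are hypothetical, chosen here for illustration.

```python
from fractions import Fraction

def alternating_game(p1, p2, max_plays=10):
    """Return (P(A), P(B), P(C)): first player wins, second wins, neither wins."""
    pa = pb = Fraction(0)
    no_success_yet = Fraction(1)          # probability that all earlier plays failed
    for i in range(1, max_plays + 1):
        p = p1 if i % 2 == 1 else p2      # odd plays: first player; even: second
        win_now = no_success_yet * p
        if i % 2 == 1:
            pa += win_now
        else:
            pb += win_now
        no_success_yet *= 1 - p
    return pa, pb, no_success_yet

PA, PB, PC = alternating_game(Fraction(1, 4), Fraction(1, 3))
print(PA, PB, PC)                         # values for part (b)
PA_d, PB_d, PC_d = alternating_game(Fraction(1, 3), Fraction(1, 2))
print(PA_d, PB_d, PC_d)                   # values for part (d)
```

With the part (d) values $p_1 = 1/3$, $p_2 = 1/2$ the same formulas give $q_1 q_2 = 1/3$, so $P(A) = P(B) = 121/243$ and $P(C) = 1/243$.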

Solution to Exercise 4.48 (p. 99) 

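a. • $A = E_1 \vee \bigvee_{k=1}^{\infty} \left(\bigcap_{i=1}^{3k} E_i^c\right) E_{3k+1}$

   • $B = E_1^c E_2 \vee \bigvee_{k=1}^{\infty} \left(\bigcap_{i=1}^{3k+1} E_i^c\right) E_{3k+2}$

   • $C = E_1^c E_2^c E_3 \vee \bigvee_{k=1}^{\infty} \left(\bigcap_{i=1}^{3k+2} E_i^c\right) E_{3k+3}$

b. • $P(A) = p_1 \sum_{k=0}^{\infty} (q_1 q_2 q_3)^k = \frac{p_1}{1 - q_1 q_2 q_3}$

   • $P(B) = \frac{q_1 p_2}{1 - q_1 q_2 q_3}$

   • $P(C) = \frac{q_1 q_2 p_3}{1 - q_1 q_2 q_3}$

   • For $p_1 = 1/4$, $p_2 = 1/3$, $p_3 = 1/2$, $P(A) = P(B) = P(C) = 1/3$.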

Solution to Exercise 4.49 (p. 100) 

Let $A_{rn}$ be the event of exactly $r$ successes in $n$ trials. Then
$P(A_{rn} E_i) = p\,C(n-1, r-1)\,p^{r-1} q^{n-r}$ and $P(A_{rn}) = C(n, r)\,p^r q^{n-r}$.

Hence $P(E_i \mid A_{rn}) = C(n-1, r-1)/C(n, r) = r/n$.
Solution to Exercise 4.50 (p. 100) 

Let $A_1$ = event one failure, $B_1$ = event of one or more failures, $B_2$ = event of two or more failures, and
$F_n$ = the event the first defective unit found is the $n$th.

a. $F_n \subset B_1$ implies $P(F_n \mid B_1) = P(F_n)/P(B_1) = \frac{q^{n-1} p}{1 - q^N}$

b.

$$P(F_n \mid A_1) = \frac{P(F_n A_1)}{P(A_1)} = \frac{q^{n-1} p\, q^{N-n}}{N p q^{N-1}} = \frac{1}{N}$$ (4.42)

(see Exercise 4.49)

c. Since the probability that not all units after the $n$th are good is $1 - q^{N-n}$,

$$P(B_2 \mid F_n) = \frac{P(B_2 F_n)}{P(F_n)} = \frac{q^{n-1} p \left(1 - q^{N-n}\right)}{q^{n-1} p} = 1 - q^{N-n}$$ (4.43)

Chapter 5

Conditional Independence 



5.1 Conditional Independence 1 

The idea of stochastic (probabilistic) independence is explored in the unit Independence of Events (Section 4.1).
The concept is approached as lack of conditioning: $P(A \mid B) = P(A)$. This is equivalent to the
product rule $P(AB) = P(A) P(B)$. We consider an extension to conditional independence.

5.1.1 The concept 

Examination of the independence concept reveals two important mathematical facts: 

• Independence of a class of non-mutually exclusive events depends upon the probability measure, and
not on the relationship between the events. Independence cannot be displayed on a Venn diagram, 
unless probabilities are indicated. For one probability measure a pair may be independent while for 
another probability measure the pair may not be independent. 

• Conditional probability is a probability measure, since it has the three defining properties and all those 
properties derived therefrom. 

This raises the question: is there a useful conditional independence — i.e., independence with respect to a 
conditional probability measure? In this chapter we explore that question in a fruitful way. 

Among the simple examples of "operational independence" in the unit on independence of events, which
lead naturally to an assumption of "probabilistic independence," are the following:

• If customers come into a well stocked shop at different times, each unaware of the choice made by the
other, then the item purchased by one should not be affected by the choice made by the other.

• If two students are taking exams in different courses, the grade one makes should not affect the grade
made by the other.

Example 5.1: Buying umbrellas and the weather 

A department store has a nice stock of umbrellas. Two customers come into the store "independently."
Let A be the event the first buys an umbrella and B the event the second buys an umbrella.
Normally, we should think the events {A, B} form an independent pair. But consider the effect of
weather on the purchases. Let C be the event the weather is rainy (i.e., is raining or threatening
to rain). Now we should think $P(A \mid C) > P(A \mid C^c)$ and $P(B \mid C) > P(B \mid C^c)$. The weather has
a decided effect on the likelihood of buying an umbrella. But given the fact the weather is rainy
(event C has occurred), it would seem reasonable that purchase of an umbrella by one should not
affect the likelihood of such a purchase by the other. Thus, it may be reasonable to suppose

$$P(A \mid C) = P(A \mid BC) \quad \text{or, in another notation,} \quad P_C(A) = P_C(A \mid B)$$ (5.1)

An examination of the sixteen equivalent conditions for independence, with probability measure $P$
replaced by probability measure $P_C$, shows that we have independence of the pair {A, B} with
respect to the conditional probability measure $P_C(\cdot) = P(\cdot \mid C)$. Thus, $P(AB \mid C) = P(A \mid C) P(B \mid C)$.
For this example, we should also expect that $P(A \mid C^c) = P(A \mid BC^c)$, so that there is independence
with respect to the conditional probability measure $P(\cdot \mid C^c)$. Does this make the pair {A, B}
independent (with respect to the prior probability measure $P$)? Some numerical examples make it
plain that only in the most unusual cases would the pair be independent. Without calculations,
we can see why this should be so. If the first customer buys an umbrella, this indicates a higher
than normal likelihood that the weather is rainy, in which case the second customer is likely to
buy. The condition leads to $P(B \mid A) > P(B)$. Consider the following numerical case. Suppose
$P(AB \mid C) = P(A \mid C) P(B \mid C)$ and $P(AB \mid C^c) = P(A \mid C^c) P(B \mid C^c)$ and

$$P(A \mid C) = 0.60, \quad P(A \mid C^c) = 0.20, \quad P(B \mid C) = 0.50, \quad P(B \mid C^c) = 0.15, \quad \text{with } P(C) = 0.30.$$ (5.2)

Then

$$P(A) = P(A \mid C) P(C) + P(A \mid C^c) P(C^c) = 0.3200, \qquad P(B) = P(B \mid C) P(C) + P(B \mid C^c) P(C^c) = 0.2550$$ (5.3)

$$P(AB) = P(AB \mid C) P(C) + P(AB \mid C^c) P(C^c) = P(A \mid C) P(B \mid C) P(C) + P(A \mid C^c) P(B \mid C^c) P(C^c) = 0.1110$$ (5.4)

As a result,

$$P(A) P(B) = 0.0816 \ne 0.1110 = P(AB)$$ (5.5)

The product rule fails, so that the pair is not independent. An examination of the pattern of
computation shows that independence would require very special probabilities which are not likely
to be encountered.

1 This content is available online at <http://cnx.org/content/m23258/1.7/>.

Example 5.2: Students and exams 

Two students take exams in different courses. Under normal circumstances, one would suppose their
performances form an independent pair. Let A be the event the first student makes grade 80 or
better and B be the event the second has a grade of 80 or better. The exam is given on Monday
morning. It is the fall semester. There is a probability 0.30 that there was a football game on
Saturday, and both students are enthusiastic fans. Let C be the event of a game on the previous
Saturday. Now it is reasonable to suppose

$$P(A \mid C) = P(A \mid BC) \quad \text{and} \quad P(A \mid C^c) = P(A \mid BC^c)$$ (5.6)

If we know that there was a Saturday game, additional knowledge that B has occurred does not
affect the likelihood that A occurs. Again, use of equivalent conditions shows that the situation may
be expressed

$$P(AB \mid C) = P(A \mid C) P(B \mid C) \quad \text{and} \quad P(AB \mid C^c) = P(A \mid C^c) P(B \mid C^c)$$ (5.7)

Under these conditions, we should suppose that $P(A \mid C) < P(A \mid C^c)$ and $P(B \mid C) < P(B \mid C^c)$.
If we knew that one did poorly on the exam, this would increase the likelihood there was a
Saturday game and hence increase the likelihood that the other did poorly. The failure to be
independent arises from a common chance factor that affects both. Although their performances
are "operationally" independent, they are not independent in the probability sense. As a numerical
example, suppose

$$P(A \mid C) = 0.7, \quad P(A \mid C^c) = 0.9, \quad P(B \mid C) = 0.6, \quad P(B \mid C^c) = 0.8, \quad P(C) = 0.3$$ (5.8)

Straightforward calculations show $P(A) = 0.8400$, $P(B) = 0.7400$, $P(AB) = 0.6300$. Note that
$P(A \mid B) = 0.8514 > P(A)$, as would be expected.
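
The "straightforward calculations" in Examples 5.1 and 5.2 follow one pattern: average over C and its complement, using the product rule inside each condition. A minimal Python sketch of that pattern is shown below; the function name `mix_over_condition` and its argument names are hypothetical, introduced only for this illustration.

```python
def mix_over_condition(pa_c, pa_cc, pb_c, pb_cc, pc):
    """Return (P(A), P(B), P(AB)), assuming {A, B} is conditionally independent
    given C and given C^c.

    pa_c = P(A|C), pa_cc = P(A|C^c), pb_c = P(B|C), pb_cc = P(B|C^c), pc = P(C).
    """
    pa = pa_c * pc + pa_cc * (1 - pc)
    pb = pb_c * pc + pb_cc * (1 - pc)
    pab = pa_c * pb_c * pc + pa_cc * pb_cc * (1 - pc)
    return pa, pb, pab

pa1, pb1, pab1 = mix_over_condition(0.60, 0.20, 0.50, 0.15, 0.30)   # Example 5.1
pa2, pb2, pab2 = mix_over_condition(0.7, 0.9, 0.6, 0.8, 0.3)        # Example 5.2
print(round(pa1 * pb1, 4), round(pab1, 4))   # product rule comparison, Example 5.1
print(round(pab2 / pb2, 4), round(pa2, 4))   # P(A|B) versus P(A), Example 5.2
```

Expanding the expressions shows that $P(AB) - P(A)P(B) = P(C) P(C^c) [P(A \mid C) - P(A \mid C^c)] [P(B \mid C) - P(B \mid C^c)]$ under these assumptions, so the pair is independent only when C is trivial or one of the two events is unaffected by C.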



5.1.2 Sixteen equivalent conditions 

Using the facts on repeated conditioning (Section 3.1.4: Repeated conditioning) and the equivalent
conditions for independence (Table 4.1), we may produce a similar table of equivalent conditions for conditional
independence. In the hybrid notation we use for repeated conditioning, we write

$$P_C(A \mid B) = P_C(A) \quad \text{or} \quad P_C(AB) = P_C(A) P_C(B)$$ (5.9)

This translates into

$$P(A \mid BC) = P(A \mid C) \quad \text{or} \quad P(AB \mid C) = P(A \mid C) P(B \mid C)$$ (5.10)

If it is known that C has occurred, then additional knowledge of the occurrence of B does not change the
likelihood of A.

If we write the sixteen equivalent conditions for independence in terms of the conditional probability
measure $P_C(\cdot)$, then translate as above, we have the following equivalent conditions.

Sixteen equivalent conditions

$P(A \mid BC) = P(A \mid C)$  |  $P(B \mid AC) = P(B \mid C)$  |  $P(AB \mid C) = P(A \mid C) P(B \mid C)$
$P(A \mid B^c C) = P(A \mid C)$  |  $P(B^c \mid AC) = P(B^c \mid C)$  |  $P(AB^c \mid C) = P(A \mid C) P(B^c \mid C)$
$P(A^c \mid BC) = P(A^c \mid C)$  |  $P(B \mid A^c C) = P(B \mid C)$  |  $P(A^c B \mid C) = P(A^c \mid C) P(B \mid C)$
$P(A^c \mid B^c C) = P(A^c \mid C)$  |  $P(B^c \mid A^c C) = P(B^c \mid C)$  |  $P(A^c B^c \mid C) = P(A^c \mid C) P(B^c \mid C)$

Table 5.1

$P(A \mid BC) = P(A \mid B^c C)$  |  $P(A^c \mid BC) = P(A^c \mid B^c C)$  |  $P(B \mid AC) = P(B \mid A^c C)$  |  $P(B^c \mid AC) = P(B^c \mid A^c C)$

Table 5.2

The patterns of conditioning in the examples above belong to this set. In a given problem, one or the
other of these conditions may seem a reasonable assumption. As soon as one of these patterns is recognized,
then all are equally valid assumptions. Because of its simplicity and symmetry, we take as the defining
condition the product rule $P(AB \mid C) = P(A \mid C) P(B \mid C)$.

Definition. A pair of events {A, B} is said to be conditionally independent, given C, designated
{A, B} ci |C, iff the following product rule holds: $P(AB \mid C) = P(A \mid C) P(B \mid C)$.

The equivalence of the four entries in the right hand column of the upper part of the table, establish 

The replacement rule 

If any of the pairs {A,B}, {A, B c }, {A C ,B}, or {A C ,B C } is conditionally independent, given C, then so 
are the others. 
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This may be expressed by saying that if a pair is conditionally independent, we may replace either or 
both by their complements and still have a conditionally independent pair. 
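One step of the replacement rule can be checked directly from the product rule. The derivation below is a sketch for the pair $\{A, B^c\}$, assuming $\{A, B\}$ ci $|C$; the other pairs follow by the same argument with the roles of the events interchanged.

```latex
\begin{align*}
P(AB^c|C) &= P(A|C) - P(AB|C) \\
          &= P(A|C) - P(A|C)\,P(B|C) \\
          &= P(A|C)\,[1 - P(B|C)] \\
          &= P(A|C)\,P(B^c|C)
\end{align*}
```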

To illustrate further the usefulness of this concept, we note some other common examples in which similar
conditions hold: there is operational independence, but some chance factor affects both events.

• Two contractors work quite independently on jobs in the same city. The operational independence
suggests probabilistic independence. However, both jobs are outside and subject to delays due to bad
weather. Suppose $A$ is the event the first contractor completes his job on time and $B$ is the event the
second completes on time. If $C$ is the event of "good" weather, then arguments similar to those in
Examples 1 and 2 make it seem reasonable to suppose $\{A, B\}$ ci $|C$ and $\{A, B\}$ ci $|C^c$. Remark. In
formal probability theory, an event must be sharply defined: on any trial it occurs or it does not. The
event of "good weather" is not so clearly defined. Did a trace of rain or thunder in the area constitute
bad weather? Did a rain delay on one day in a month-long project constitute bad weather? Even with
this ambiguity, the pattern of probabilistic analysis may be useful.

• A patient goes to a doctor. A preliminary examination leads the doctor to think there is a thirty percent
chance the patient has a certain disease. The doctor orders two independent tests for conditions that
indicate the disease. Are the results of these tests really independent? There is certainly operational
independence: the tests may be done by different laboratories, neither aware of the testing by the
other. Yet, if the tests are meaningful, they must both be affected by the actual condition of the
patient. Suppose $D$ is the event the patient has the disease, $A$ is the event the first test is positive
(indicates the conditions associated with the disease) and $B$ is the event the second test is positive.
Then it would seem reasonable to suppose $\{A, B\}$ ci $|D$ and $\{A, B\}$ ci $|D^c$.

In the examples considered so far, it has been reasonable to assume conditional independence, given an event 
C, and conditional independence, given the complementary event. But there are cases in which the effect of 
the conditioning event is asymmetric. We consider several examples. 

• Two students are working on a term paper. They work quite separately. They both need to borrow
a certain book from the library. Let $C$ be the event the library has two copies available. If $A$ is the
event the first completes on time and $B$ the event the second is successful, then it seems reasonable
to assume $\{A, B\}$ ci $|C$. However, if only one book is available, then the two events would not be
conditionally independent. In general $P(B|AC^c) < P(B|C^c)$, since if the first student completes on
time, then he or she must have been successful in getting the book, to the detriment of the second.

• If the two contractors of the example above both need material which may be in scarce supply, then
successful completion would be conditionally independent, given an adequate supply, whereas they would
not be conditionally independent, given a short supply.

• Two students in the same course take an exam. If they prepared separately, the events of getting
good grades should be conditionally independent. If they study together, then the likelihoods of good
grades would not be independent. With neither cheating nor collaborating on the test itself, if one does
well, the other should also.

Since conditional independence is ordinary independence with respect to a conditional probability measure, 
it should be clear how to extend the concept to larger classes of sets. 

Definition. A class $\{A_i : i \in J\}$, where $J$ is an arbitrary index set, is conditionally independent, given
event $C$, denoted $\{A_i : i \in J\}$ ci $|C$, iff the product rule holds for every finite subclass of two or more.

As in the case of simple independence, the replacement rule extends.

The replacement rule

If the class $\{A_i : i \in J\}$ ci $|C$, then any or all of the events $A_i$ may be replaced by their complements and
still have a conditionally independent class.

5.1.3 The use of independence techniques

Since conditional independence is independence, we may use independence techniques in the solution of 
problems. We consider two types of problems: an inference problem and a conditional Bernoulli sequence. 

Example 5.3: Use of independence techniques 

Sharon is investigating a business venture which she thinks has probability 0.7 of being successful. 
She checks with five "independent" advisers. If the prospects are sound, the probabilities are 0.8, 
0.75, 0.6, 0.9, and 0.8 that the advisers will advise her to proceed; if the venture is not sound, the 
respective probabilities are 0.75, 0.85, 0.7, 0.9, and 0.7 that the advice will be negative. Given 
the quality of the project, the advisers are independent of one another in the sense that no one is 
affected by the others. Of course, they are not independent, for they are all related to the soundness 
of the venture. We may reasonably assume conditional independence of the advice, given that the 
venture is sound and also given that the venture is not sound. If Sharon goes with the majority of 
advisers, what is the probability she will make the right decision? 
SOLUTION 

If the project is sound, Sharon makes the right choice if three or more of the five advisers are
positive. If the venture is unsound, she makes the right choice if three or more of the five advisers
are negative. Let $H$ = the event the project is sound, $F$ = the event three or more advisers are
positive, $G = F^c$ = the event three or more are negative, and $E$ = the event of the correct decision.
Then

$$P(E) = P(FH) + P(GH^c) = P(F|H)\,P(H) + P(G|H^c)\,P(H^c) \tag{5.11}$$

Let $E_i$ be the event the $i$th adviser is positive. Then $P(F|H)$ = the sum of probabilities of the
form $P(M_k|H)$, where the $M_k$ are minterms generated by the class $\{E_i : 1 \le i \le 5\}$. Because of the
assumed conditional independence,

$$P(E_1 E_2^c E_3^c E_4 E_5|H) = P(E_1|H)\,P(E_2^c|H)\,P(E_3^c|H)\,P(E_4|H)\,P(E_5|H) \tag{5.12}$$

with similar expressions for each $P(M_k|H)$ and $P(M_k|H^c)$. This means that if we want the
probability of three or more successes, given $H$, we can use ckn with the matrix of conditional
probabilities. The following MATLAB solution of the investment problem is indicated.

P1 = 0.01*[80 75 60 90 80];      % P(adviser i is positive | H)
P2 = 0.01*[75 85 70 90 70];      % P(adviser i is negative | H^c)
PH = 0.7;                        % P(H)

PE = ckn(P1,3)*PH + ckn(P2,3)*(1 - PH)
PE = 0.9255
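For readers without the m-function ckn at hand, the same number can be obtained with base MATLAB only. The sketch below is not the author's implementation: it builds the distribution of the number of successes by repeated convolution with the base function conv, and the names d1, d2, PF, and PG are illustrative assumptions.

```matlab
P1 = 0.01*[80 75 60 90 80];      % P(adviser i is positive | H)
P2 = 0.01*[75 85 70 90 70];      % P(adviser i is negative | H^c)
PH = 0.7;                        % P(H)

% d(j+1) = probability of exactly j successes among the events processed so far
d1 = 1;  d2 = 1;
for i = 1:5
    d1 = conv(d1, [1 - P1(i), P1(i)]);
    d2 = conv(d2, [1 - P2(i), P2(i)]);
end

PF = sum(d1(4:6));               % P(three or more positive | H)
PG = sum(d2(4:6));               % P(three or more negative | H^c)
PE = PF*PH + PG*(1 - PH)
```

The convolution step is valid only because the individual events are treated as independent under each condition, which is exactly the assumed conditional independence given H and given H^c.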

Often a Bernoulli sequence is related to some conditioning event $H$. In this case it is reasonable to assume
the sequence $\{E_i : 1 \le i \le n\}$ ci $|H$ and ci $|H^c$. We consider a simple example.

Example 5.4: Test of a claim 

A race track regular claims he can pick the winning horse in any race 90 percent of the time. In
order to test his claim, he picks a horse to win in each of ten races. There are five horses in each
race. If he is simply guessing, the probability of success on each race is 0.2. Consider the trials to
constitute a Bernoulli sequence. Let $H$ be the event he is correct in his claim. If $S$ is the number
of successes in picking the winners in the ten races, determine $P(H|S = k)$ for various numbers $k$
of correct picks. Suppose it is equally likely that his claim is valid or that he is merely guessing.
We assume two conditional Bernoulli trials:

Claim is valid: Ten trials, probability $p = P(E_i|H) = 0.9$.

Guessing at random: Ten trials, probability $p = P(E_i|H^c) = 0.2$.

Let $S$ = number of correct picks in ten trials. Then

$$\frac{P(H|S = k)}{P(H^c|S = k)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(S = k|H)}{P(S = k|H^c)}, \quad 0 \le k \le 10 \tag{5.13}$$

Giving him the benefit of the doubt, we suppose $P(H)/P(H^c) = 1$ and calculate the conditional
odds.

k = 0:10;
Pk1 = ibinom(10,0.9,k);          % Probability of k successes, given H
Pk2 = ibinom(10,0.2,k);          % Probability of k successes, given H^c
OH  = Pk1./Pk2;                  % Conditional odds, assuming P(H)/P(H^c) = 1
e   = OH > 1;                    % Selects favorable odds
disp(round([k(e);OH(e)]'))
           6           2         % Needs at least six to have credibility
           7          73         % Seven would be credible,
           8        2627         % even if P(H)/P(H^c) = 0.1
           9       94585
          10     3405063



Under these assumptions, he would have to pick at least seven correctly to give reasonable validation 
of his claim. 
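The table of odds can be reproduced without the m-function ibinom. The following is a minimal sketch in base MATLAB, assuming only the binomial formula; the names pH, pG, and C are illustrative and do not come from the text.

```matlab
n  = 10;  k = 0:n;
pH = 0.9;                                   % success probability per race if the claim is valid
pG = 0.2;                                   % success probability per race if he is guessing

C   = arrayfun(@(j) nchoosek(n, j), k);     % binomial coefficients C(n,k)
Pk1 = C .* pH.^k .* (1 - pH).^(n - k);      % P(S = k | H)
Pk2 = C .* pG.^k .* (1 - pG).^(n - k);      % P(S = k | H^c)

OH = Pk1 ./ Pk2;                            % conditional odds with prior odds 1
e  = OH > 1;                                % favorable odds only
disp(round([k(e); OH(e)]'))
```

Since the binomial coefficients cancel in the ratio, the odds reduce to (0.9/0.2)^k (0.1/0.8)^(10-k), which is why they grow so quickly with k.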



5.2 Patterns of Probable Inference 2 
5.2.1 Some Patterns of Probable Inference 

We are concerned with the likelihood of some hypothesized condition. In general, we have evidence for the 
condition which can never be absolutely certain. We are forced to assess probabilities (likelihoods) on the 
basis of the evidence. Some typical examples: 



HYPOTHESIS                EVIDENCE
Job success               Personal traits
Presence of oil           Geological structures
Operation of a device     Physical condition
Market condition          Test market condition
Presence of a disease     Tests for symptoms

Table 5.3

If $H$ is the event the hypothetical condition exists and $E$ is the event the evidence occurs, the probabilities
available are usually $P(H)$ (or an odds value), $P(E|H)$, and $P(E|H^c)$. What is desired is $P(H|E)$ or,
equivalently, the odds $P(H|E)/P(H^c|E)$. We simply use Bayes' rule to reverse the direction of conditioning.

$$\frac{P(H|E)}{P(H^c|E)} = \frac{P(E|H)}{P(E|H^c)} \cdot \frac{P(H)}{P(H^c)} \tag{5.14}$$

No conditional independence is involved in this case.

2 This content is available online at <http://cnx.Org/content/m23259/l.6/>.

Independent evidence for the hypothesized condition

Suppose there are two "independent" bits of evidence. Now obtaining this evidence may be "operationally"
independent, but if the items both relate to the hypothesized condition, then they cannot be really
independent. The condition assumed is usually of the form $P(E_1|H) = P(E_1|HE_2)$: if $H$ occurs, then
knowledge of $E_2$ does not affect the likelihood of $E_1$. Similarly, we usually have $P(E_1|H^c) = P(E_1|H^cE_2)$.
Thus $\{E_1, E_2\}$ ci $|H$ and $\{E_1, E_2\}$ ci $|H^c$.

Example 5.5: Independent medical tests 

Suppose a doctor thinks the odds are 2/1 that a patient has a certain disease. She orders two
independent tests. Let $H$ be the event the patient has the disease and $E_1$ and $E_2$ be the events
the tests are positive. Suppose the first test has probability 0.1 of a false positive and probability
0.05 of a false negative. The second test has probabilities 0.05 and 0.08 of false positive and false
negative, respectively. If both tests are positive, what is the posterior probability the patient has
the disease?
SOLUTION
Assuming $\{E_1, E_2\}$ ci $|H$ and ci $|H^c$, we work first in terms of the odds, then convert to probability.

$$\frac{P(H|E_1E_2)}{P(H^c|E_1E_2)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(E_1E_2|H)}{P(E_1E_2|H^c)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(E_1|H)\,P(E_2|H)}{P(E_1|H^c)\,P(E_2|H^c)} \tag{5.15}$$

The data are

$$P(H)/P(H^c) = 2, \quad P(E_1|H) = 0.95, \quad P(E_1|H^c) = 0.1, \quad P(E_2|H) = 0.92, \quad P(E_2|H^c) = 0.05 \tag{5.16}$$

Substituting values, we get

$$\frac{P(H|E_1E_2)}{P(H^c|E_1E_2)} = 2 \cdot \frac{0.95 \cdot 0.92}{0.10 \cdot 0.05} = \frac{1748}{5} \quad \text{so that} \quad P(H|E_1E_2) = \frac{1748}{1753} = 1 - \frac{5}{1753} = 1 - 0.0029 \tag{5.17}$$
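The arithmetic of this example is easy to script. The lines below are a sketch in base MATLAB of the odds calculation in (5.15), with illustrative variable names; the detection probabilities are obtained as one minus the stated false negative rates.

```matlab
prior = 2;                          % prior odds P(H)/P(H^c)
PE1H  = 1 - 0.05;  PE1Hc = 0.10;    % test 1: P(E1|H), P(E1|H^c)
PE2H  = 1 - 0.08;  PE2Hc = 0.05;    % test 2: P(E2|H), P(E2|H^c)

% Conditional independence lets the likelihood ratios multiply
odds = prior * (PE1H*PE2H) / (PE1Hc*PE2Hc);
post = odds/(1 + odds);             % convert odds to probability

fprintf('Posterior odds = %.1f, posterior probability = %.4f\n', odds, post)
```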



Evidence for a symptom 

Sometimes the evidence dealt with is not evidence for the hypothesized condition, but for some condition 
which is stochastically related. For purposes of exposition, we refer to this intermediary condition as a 
symptom. Consider again the examples above. 



HYPOTHESIS                SYMPTOM                   EVIDENCE
Job success               Personal traits           Diagnostic test results
Presence of oil           Geological structures     Geophysical survey results
Operation of a device     Physical condition        Monitoring report
Market condition          Test market condition     Market survey result
Presence of a disease     Physical symptom          Test for symptom

Table 5.4

We let S be the event the symptom is present. The usual case is that the evidence is directly related to 
the symptom and not the hypothesized condition. The diagnostic test results can say something about an 
applicant's personal traits, but cannot deal directly with the hypothesized condition. The test results would 
be the same whether or not the candidate is successful in the job (he or she does not have the job yet). A 
geophysical survey deals with certain structural features beneath the surface. If a fault or a salt dome is 
present, the geophysical results are the same whether or not there is oil present. The physical monitoring 
report deals with certain physical characteristics. Its reading is the same whether or not the device will fail. 
A market survey treats only the condition in the test market. The results depend upon the test market, not 
the national market. A blood test may be for certain physical conditions which frequently are related (at
least statistically) to the disease. But the result of the blood test for the physical condition is not directly 
affected by the presence or absence of the disease. 
Under conditions of this type, we may assume 

$$P(E|SH) = P(E|SH^c) \quad \text{and} \quad P(E|S^cH) = P(E|S^cH^c) \tag{5.18}$$

These imply $\{E, H\}$ ci $|S$ and ci $|S^c$. Now

$$\frac{P(H|E)}{P(H^c|E)} = \frac{P(HE)}{P(H^cE)} = \frac{P(HES) + P(HES^c)}{P(H^cES) + P(H^cES^c)} = \frac{P(HS)\,P(E|HS) + P(HS^c)\,P(E|HS^c)}{P(H^cS)\,P(E|H^cS) + P(H^cS^c)\,P(E|H^cS^c)}$$

$$= \frac{P(HS)\,P(E|S) + P(HS^c)\,P(E|S^c)}{P(H^cS)\,P(E|S) + P(H^cS^c)\,P(E|S^c)} \tag{5.19}$$

It is worth noting that each term in the denominator differs from the corresponding term in the numerator
by having $H^c$ in place of $H$. Before completing the analysis, it is necessary to consider how $H$ and $S$ are
related stochastically in the data. Four cases may be considered.

a. Data are $P(S|H)$, $P(S|H^c)$, and $P(H)$.

b. Data are $P(S|H)$, $P(S|H^c)$, and $P(S)$.

c. Data are $P(H|S)$, $P(H|S^c)$, and $P(S)$.

d. Data are $P(H|S)$, $P(H|S^c)$, and $P(H)$.

Case a: Writing $P(HS) = P(H)\,P(S|H)$, and similarly for the other joint probabilities in (5.19), we have

$$\frac{P(H|E)}{P(H^c|E)} = \frac{P(H)\,P(S|H)\,P(E|S) + P(H)\,P(S^c|H)\,P(E|S^c)}{P(H^c)\,P(S|H^c)\,P(E|S) + P(H^c)\,P(S^c|H^c)\,P(E|S^c)} \tag{5.20}$$



Example 5.6: Geophysical survey 

Let $H$ be the event of a successful oil well, $S$ be the event there is a geophysical structure
favorable to the presence of oil, and $E$ be the event the geophysical survey indicates a favorable
structure. We suppose $\{H, E\}$ ci $|S$ and ci $|S^c$. Data are

$$P(H)/P(H^c) = 3, \quad P(S|H) = 0.92, \quad P(S|H^c) = 0.20, \quad P(E|S) = 0.95, \quad P(E|S^c) = 0.15 \tag{5.21}$$

Then

$$\frac{P(H|E)}{P(H^c|E)} = 3 \cdot \frac{0.92 \cdot 0.95 + 0.08 \cdot 0.15}{0.20 \cdot 0.95 + 0.80 \cdot 0.15} = \frac{1329}{155} = 8.5742 \tag{5.22}$$

$$\text{so that} \quad P(H|E) = 1 - \frac{155}{1484} = 0.8956 \tag{5.23}$$

The geophysical result moved the prior odds of 3/1 to posterior odds of 8.6/1, with a corresponding
change of probabilities from 0.75 to 0.90.
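The case a pattern (5.20) translates line for line into code. The sketch below uses base MATLAB and illustrative variable names; it factors the prior odds out of numerator and denominator, as is done in (5.22).

```matlab
oddsH = 3;                  % prior odds P(H)/P(H^c)
PSH = 0.92;  PSHc = 0.20;   % P(S|H), P(S|H^c)
PES = 0.95;  PESc = 0.15;   % P(E|S), P(E|S^c)

% Total probability over S and S^c, using {E,H} ci |S and ci |S^c
num = PSH*PES  + (1 - PSH)*PESc;     % P(E|H)
den = PSHc*PES + (1 - PSHc)*PESc;    % P(E|H^c)

odds = oddsH*num/den;
post = odds/(1 + odds);
fprintf('Posterior odds = %.4f, P(H|E) = %.4f\n', odds, post)
```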

Case b: Data are $P(S)$, $P(S|H)$, $P(S|H^c)$, $P(E|S)$, and $P(E|S^c)$. If we can determine $P(H)$, we can
proceed as in case a. Now by the law of total probability

$$P(S) = P(S|H)\,P(H) + P(S|H^c)\,[1 - P(H)] \tag{5.24}$$

which may be solved algebraically to give

$$P(H) = \frac{P(S) - P(S|H^c)}{P(S|H) - P(S|H^c)} \tag{5.25}$$

Example 5.7: Geophysical survey revisited

In many cases a better estimate of $P(S)$ or the odds $P(S)/P(S^c)$ can be made on the basis
of previous geophysical data. Suppose the prior odds for $S$ are 3/1, so that $P(S) = 0.75$.
Using the other data in Example 5.6 (Geophysical survey), we have

$$P(H) = \frac{P(S) - P(S|H^c)}{P(S|H) - P(S|H^c)} = \frac{0.75 - 0.20}{0.92 - 0.20} = \frac{55}{72}, \quad \text{so that} \quad \frac{P(H)}{P(H^c)} = \frac{55}{17} \tag{5.26}$$

Using the pattern of case a, we have

$$\frac{P(H|E)}{P(H^c|E)} = \frac{55}{17} \cdot \frac{0.92 \cdot 0.95 + 0.08 \cdot 0.15}{0.20 \cdot 0.95 + 0.80 \cdot 0.15} = \frac{4873}{527} = 9.2467 \tag{5.27}$$

$$\text{so that} \quad P(H|E) = 1 - \frac{527}{5400} = 0.9024 \tag{5.28}$$
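In case b the only new step is recovering P(H) from P(S) by (5.25). A brief base-MATLAB sketch, with illustrative names, makes that step explicit before reusing the case a pattern:

```matlab
PS  = 0.75;                 % P(S), from prior odds 3/1 for the structure
PSH = 0.92;  PSHc = 0.20;   % P(S|H), P(S|H^c)
PES = 0.95;  PESc = 0.15;   % P(E|S), P(E|S^c)

PH    = (PS - PSHc)/(PSH - PSHc);    % solve the total probability relation for P(H)
oddsH = PH/(1 - PH);                 % prior odds for H

odds = oddsH * (PSH*PES + (1 - PSH)*PESc) / (PSHc*PES + (1 - PSHc)*PESc);
post = odds/(1 + odds);
fprintf('P(H) = %.4f, posterior odds = %.4f, P(H|E) = %.4f\n', PH, odds, post)
```

Note that (5.25) yields a probability only when P(S) lies between P(S|H^c) and P(S|H); data violating this are mutually inconsistent.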

Usually data relating test results to symptom are of the form $P(E|S)$ and $P(E|S^c)$, or equivalent.
Data relating the symptom and the hypothesized condition may go either way. In cases a and b, the
data are in the form $P(S|H)$ and $P(S|H^c)$, or equivalent, derived from data showing the fraction of
times the symptom is noted when the hypothesized condition is identified. But these data may go in
the opposite direction, yielding $P(H|S)$ and $P(H|S^c)$, or equivalent. This is the situation in cases c
and d.

Case c: Data are $P(E|S)$, $P(E|S^c)$, $P(H|S)$, $P(H|S^c)$, and $P(S)$.

Example 5.8: Evidence for a disease symptom with prior P (S)

When a certain blood syndrome is observed, a given disease is indicated 93 percent of the
time. The disease is found without this syndrome only three percent of the time. A test
for the syndrome has probability 0.03 of a false positive and 0.05 of a false negative. A
preliminary examination indicates a probability 0.30 that a patient has the syndrome. A test
is performed; the result is negative. What is the probability the patient has the disease?
SOLUTION
In terms of the notation above, the data are

$$P(S) = 0.30, \quad P(E|S^c) = 0.03, \quad P(E^c|S) = 0.05, \tag{5.29}$$

$$P(H|S) = 0.93, \quad \text{and} \quad P(H|S^c) = 0.03 \tag{5.30}$$

We suppose $\{H, E\}$ ci $|S$ and ci $|S^c$.

$$\frac{P(H|E^c)}{P(H^c|E^c)} = \frac{P(S)\,P(H|S)\,P(E^c|S) + P(S^c)\,P(H|S^c)\,P(E^c|S^c)}{P(S)\,P(H^c|S)\,P(E^c|S) + P(S^c)\,P(H^c|S^c)\,P(E^c|S^c)} \tag{5.31}$$

$$= \frac{0.30 \cdot 0.93 \cdot 0.05 + 0.70 \cdot 0.03 \cdot 0.97}{0.30 \cdot 0.07 \cdot 0.05 + 0.70 \cdot 0.97 \cdot 0.97} = \frac{429}{8246} \tag{5.32}$$

which implies $P(H|E^c) = 429/8675 \approx 0.05$.
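When the data run from symptom to hypothesis (case c), the sum is taken over S and S^c with P(S) as the weight. The following base-MATLAB sketch mirrors (5.31) for the negative test result; the variable names are illustrative.

```matlab
PS   = 0.30;                 % P(S)
PHS  = 0.93;  PHSc = 0.03;   % P(H|S), P(H|S^c)
PEcS = 0.05;                 % P(E^c|S), false negative
PESc = 0.03;                 % P(E|S^c), false positive

num = PS*PHS*PEcS       + (1 - PS)*PHSc*(1 - PESc);        % P(H E^c)
den = PS*(1 - PHS)*PEcS + (1 - PS)*(1 - PHSc)*(1 - PESc);  % P(H^c E^c)

odds = num/den;
post = odds/(1 + odds);
fprintf('Posterior odds = %.4f, P(H|E^c) = %.4f\n', odds, post)
```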

Case d: This differs from case c only in the fact that a prior probability for $H$ is assumed. In this case, we
determine the corresponding probability for $S$ by

$$P(S) = \frac{P(H) - P(H|S^c)}{P(H|S) - P(H|S^c)} \tag{5.33}$$

and use the pattern of case c.

Example 5.9: Evidence for a disease symptom with prior P (H)

Suppose for the patient in Example 5.8 (Evidence for a disease symptom with prior P (S)) the
physician estimates the odds favoring the presence of the disease are 1/3, so that $P(H) = 0.25$.
Again, the test result is negative. Determine the posterior odds, given $E^c$.
SOLUTION
First we determine

$$P(S) = \frac{P(H) - P(H|S^c)}{P(H|S) - P(H|S^c)} = \frac{0.25 - 0.03}{0.93 - 0.03} = 11/45 \tag{5.34}$$

Then

$$\frac{P(H|E^c)}{P(H^c|E^c)} = \frac{(11/45) \cdot 0.93 \cdot 0.05 + (34/45) \cdot 0.03 \cdot 0.97}{(11/45) \cdot 0.07 \cdot 0.05 + (34/45) \cdot 0.97 \cdot 0.97} = \frac{15009}{320291} = 0.047 \tag{5.35}$$

The result of the test drops the prior odds of 1/3 to approximately 1/21.
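Case d adds one step to case c: P(S) is first recovered from the assumed P(H) by (5.33). A short sketch in base MATLAB, with illustrative names:

```matlab
PH   = 0.25;                 % prior P(H), from odds 1/3
PHS  = 0.93;  PHSc = 0.03;   % P(H|S), P(H|S^c)
PEcS = 0.05;  PESc = 0.03;   % false negative and false positive rates of the test

PS = (PH - PHSc)/(PHS - PHSc);       % implied probability of the syndrome

num = PS*PHS*PEcS       + (1 - PS)*PHSc*(1 - PESc);
den = PS*(1 - PHS)*PEcS + (1 - PS)*(1 - PHSc)*(1 - PESc);
odds = num/den;
fprintf('P(S) = %.4f, posterior odds = %.4f (about 1/%.0f)\n', PS, odds, 1/odds)
```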



Independent evidence for a symptom 

In the previous cases, we consider only a single item of evidence for a symptom. But it may be desirable
to have a "second opinion." We suppose the tests are for the symptom and are not directly related to the
hypothetical condition. If the tests are operationally independent, we could reasonably assume

$P(E_1|SE_2) = P(E_1|SE_2^c)$            $\{E_1, E_2\}$ ci $|S$

$P(E_1|SH) = P(E_1|SH^c)$                $\{E_1, H\}$ ci $|S$

$P(E_2|SH) = P(E_2|SH^c)$                $\{E_2, H\}$ ci $|S$

$P(E_1E_2|SH) = P(E_1E_2|SH^c)$          $\{E_1E_2, H\}$ ci $|S$

This implies $\{E_1, E_2, H\}$ ci $|S$. A similar condition holds for $S^c$. As for a single test, there are four cases,
depending on the tie between $S$ and $H$. We consider a "case a" example.

Example 5.10: A market survey problem 

A food company is planning to market nationally a new breakfast cereal. Its executives feel confident 
that the odds are at least 3 to 1 the product would be successful. Before launching the new product, 
the company decides to investigate a test market. Previous experience indicates that the reliability 
of the test market is such that if the national market is favorable, there is probability 0.9 that the 
test market is also. On the other hand, if the national market is unfavorable, there is a probability 
of only 0.2 that the test market will be favorable. These facts lead to the following analysis. Let 

$H$ be the event the national market is favorable (hypothesis)

$S$ be the event the test market is favorable (symptom)

The initial data are the following probabilities, based on past experience:

• (a) Prior odds: $P(H)/P(H^c) = 3$

• (b) Reliability of the test market: $P(S|H) = 0.9$, $P(S|H^c) = 0.2$

If it were known that the test market is favorable, we should have

$$\frac{P(H|S)}{P(H^c|S)} = \frac{P(S|H)\,P(H)}{P(S|H^c)\,P(H^c)} = \frac{0.9}{0.2} \cdot 3 = 13.5$$

Unfortunately, it is not feasible to know with certainty the state of the test market. The company
decision makers engage two market survey companies to make independent surveys of the test
market. The reliability of the companies may be expressed as follows. Let

$E_1$ be the event the first company reports a favorable test market.
$E_2$ be the event the second company reports a favorable test market.

On the basis of previous experience, the reliability of the evidence about the test market (the
symptom) is expressed in the following conditional probabilities.

$$P(E_1|S) = 0.9, \quad P(E_1|S^c) = 0.3, \quad P(E_2|S) = 0.8, \quad P(E_2|S^c) = 0.2 \tag{5.38}$$

Both survey companies report that the test market is favorable. What is the probability the
national market is favorable, given this result?
SOLUTION

The two survey firms work in an "operationally independent" manner. The report of either company 
is unaffected by the work of the other. Also, each report is affected only by the condition of the 
test market — regardless of what the national market may be. According to the discussion above, 
we should be able to assume 

$$\{E_1, E_2, H\} \text{ ci } |S \quad \text{and} \quad \{E_1, E_2, H\} \text{ ci } |S^c \tag{5.39}$$

We may use a pattern similar to that in Example 2, as follows:

$$\frac{P(H|E_1E_2)}{P(H^c|E_1E_2)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(S|H)\,P(E_1|S)\,P(E_2|S) + P(S^c|H)\,P(E_1|S^c)\,P(E_2|S^c)}{P(S|H^c)\,P(E_1|S)\,P(E_2|S) + P(S^c|H^c)\,P(E_1|S^c)\,P(E_2|S^c)} \tag{5.40}$$

$$= 3 \cdot \frac{0.9 \cdot 0.9 \cdot 0.8 + 0.1 \cdot 0.3 \cdot 0.2}{0.2 \cdot 0.9 \cdot 0.8 + 0.8 \cdot 0.3 \cdot 0.2} = \frac{327}{32} \approx 10.22 \tag{5.41}$$

In terms of the posterior probability, we have

$$P(H|E_1E_2) = \frac{327/32}{1 + 327/32} = \frac{327}{359} = 1 - \frac{32}{359} \approx 0.91 \tag{5.42}$$

We note that the odds favoring $H$, given positive indications from both survey companies, are 10.2,
as compared with the odds favoring $H$, given a favorable test market, of 13.5. The difference
reflects the residual uncertainty about the test market after the market surveys. Nevertheless, the
results of the market surveys increase the odds favoring a satisfactory market from the prior 3 to
1 to a posterior 10.2 to 1. In terms of probabilities, the market surveys increase the likelihood
of a favorable market from the original $P(H) = 0.75$ to the posterior $P(H|E_1E_2) = 0.91$. The
conditional independence of the results of the survey makes possible direct use of the data.
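With two items of evidence for the symptom, each term of the case a pattern simply acquires a second likelihood factor. The sketch below, in base MATLAB with illustrative names, evaluates (5.40) and also the reference odds that would apply if the state of the test market were known.

```matlab
oddsH = 3;                   % prior odds P(H)/P(H^c)
PSH = 0.9;  PSHc = 0.2;      % P(S|H), P(S|H^c)
PE1 = [0.9 0.3];             % [P(E1|S), P(E1|S^c)]
PE2 = [0.8 0.2];             % [P(E2|S), P(E2|S^c)]

like = PE1 .* PE2;           % [P(E1E2|S), P(E1E2|S^c)] by conditional independence

num = PSH*like(1)  + (1 - PSH)*like(2);     % P(E1E2|H)
den = PSHc*like(1) + (1 - PSHc)*like(2);    % P(E1E2|H^c)

odds      = oddsH*num/den;                  % posterior odds given both reports
post      = odds/(1 + odds);
oddsKnown = oddsH*PSH/PSHc;                 % odds if S itself were known
fprintf('Odds given surveys = %.2f, P = %.4f, odds given S = %.1f\n', odds, post, oddsKnown)
```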



5.2.2 A classification problem

A population consists of members of two subgroups. It is desired to formulate a battery of questions to aid
in identifying the subclass membership of randomly selected individuals in the population. The questions
are designed so that for each individual the answers are independent, in the sense that the answers to
any subset of these questions are not affected by and do not affect the answers to any other subset of
the questions. The answers are, however, affected by the subgroup membership. Thus, our treatment of
conditional independence suggests that it is reasonable to suppose the answers are conditionally independent,
given the subgroup membership. Consider the following numerical example.

Example 5.11: A classification problem 

A sample of 125 subjects is taken from a population which has two subgroups. The subgroup 
membership of each subject in the sample is known. Each individual is asked a battery of ten 
questions designed to be independent, in the sense that the answer to any one is not affected by 
the answer to any other. The subjects answer independently. Data on the results are summarized 
in the following table: 
        GROUP 1 (69 members)        GROUP 2 (56 members)
Q       Yes     No      Unc.        Yes     No      Unc.
1       42      22      5           20      31      5
2       34      27      8           16      37      3
3       15      45      9           33      19      4
4       19      44      6           31      18      7
5       22      43      4           23      28      5
6       41      13      15          14      37      5
7       9       52      8           31      17      8
8       40      26      3           13      38      5
9       48      12      9           27      24      5
10      20      37      12          35      16      5

Table 5.5

Assume the data represent the general population consisting of these two groups, so that the 
data may be used to calculate probabilities and conditional probabilities. 

Several persons are interviewed. The result of each interview is a "profile" of answers to the 
questions. The goal is to classify the person in one of the two subgroups on the basis of the profile 
of answers. 

The following profiles were taken. 

• Y, N, Y, N, Y, U, N, U, Y, U

• N, N, U, N, Y, Y, U, N, N, Y 

• Y, Y, N, Y, U, U, N, N, Y, Y 



Classify each individual in one of the subgroups. 
SOLUTION 

Let $G_1$ = the event the person selected is from group 1, and $G_2 = G_1^c$ = the event the person
selected is from group 2. Let

$A_i$ = the event the answer to the $i$th question is "Yes"
$B_i$ = the event the answer to the $i$th question is "No"
$C_i$ = the event the answer to the $i$th question is "Uncertain"

The data are taken to mean $P(A_1|G_1) = 42/69$, $P(B_3|G_2) = 19/56$, etc. The profile
Y, N, Y, N, Y, U, N, U, Y, U corresponds to the event $E = A_1 B_2 A_3 B_4 A_5 C_6 B_7 C_8 A_9 C_{10}$.
We utilize the ratio form of Bayes' rule to calculate the posterior odds

$$\frac{P(G_1|E)}{P(G_2|E)} = \frac{P(E|G_1)}{P(E|G_2)} \cdot \frac{P(G_1)}{P(G_2)} \tag{5.43}$$

If the ratio is greater than one, classify in group 1; otherwise classify in group 2 (we assume that
a ratio exactly one is so unlikely that we can neglect it). Because of conditional independence, we
are able to determine the conditional probabilities

$$P(E|G_1) = \frac{42 \cdot 27 \cdot 15 \cdot 44 \cdot 22 \cdot 15 \cdot 52 \cdot 3 \cdot 48 \cdot 12}{69^{10}} \quad \text{and} \tag{5.44}$$

$$P(E|G_2) = \frac{20 \cdot 37 \cdot 33 \cdot 18 \cdot 23 \cdot 5 \cdot 17 \cdot 5 \cdot 27 \cdot 5}{56^{10}} \tag{5.45}$$

The odds $P(G_1)/P(G_2) = 69/56$. We find the posterior odds to be

$$\frac{P(G_1|E)}{P(G_2|E)} = \frac{42 \cdot 27 \cdot 15 \cdot 44 \cdot 22 \cdot 15 \cdot 52 \cdot 3 \cdot 48 \cdot 12}{20 \cdot 37 \cdot 33 \cdot 18 \cdot 23 \cdot 5 \cdot 17 \cdot 5 \cdot 27 \cdot 5} \cdot \frac{56^9}{69^9} = 5.85 \tag{5.46}$$

The factor $56^9/69^9$ comes from multiplying $56^{10}/69^{10}$ by the odds $P(G_1)/P(G_2) = 69/56$. Since
the resulting posterior odds favoring Group 1 is greater than one, we classify the respondent in
group 1.

While the calculations are simple and straightforward, they are tedious and error prone. To 
make possible rapid and easy solution, say in a situation where successive interviews are underway, 
we have several m-procedures for performing the calculations. Answers to the questions would
normally be designated by letters such as Y for yes, N for no, and U for uncertain. In
order for the m-procedure to work, these answers must be represented by numbers indicating the 
appropriate columns in matrices A and B. Thus, in the example under consideration, each Y must 
be translated into a 1, each N into a 2, and each U into a 3. The task is not particularly difficult, 
but it is much easier to have MATLAB make the translation as well as do the calculations. The 
following two-stage approach for solving the problem works well. 

The first m-procedure oddsdf sets up the frequency information. The next m-procedure odds 
calculates the odds for a given profile. The advantage of splitting into two m-procedures is that we 
can set up the data once, then call repeatedly for the calculations for different profiles. As always, 
it is necessary to have the data in an appropriate form. The following is an example in which the 
data are entered in terms of actual frequencies of response. 

% file oddsf4.m
% Frequency data for classification
A = [42 22 5; 34 27 8; 15 45 9; 19 44 6; 22 43 4;
     41 13 15; 9 52 8; 40 26 3; 48 12 9; 20 37 12];
B = [20 31 5; 16 37 3; 33 19 4; 31 18 7; 23 28 5;
     14 37 5; 31 17 8; 13 38 5; 27 24 5; 35 16 5];
disp('Call for oddsdf')

Example 5.12: Classification using frequency data 

oddsf4                              % Call for data in file oddsf4.m
Call for oddsdf                     % Prompt built into data file
oddsdf                              % Call for m-procedure oddsdf
Enter matrix A of frequencies for calibration group 1  A
Enter matrix B of frequencies for calibration group 2  B
Number of questions = 10
Answers per question = 3
 Enter code for answers and call for procedure "odds"
y = 1;                              % Use of lower case for easier writing
n = 2;
u = 3;
odds                                % Call for calculating procedure
Enter profile matrix E  [y n y n y u n u y u]    % First profile
Odds favoring Group 1:   5.845
Classify in Group 1
odds                                % Second call for calculating procedure
Enter profile matrix E  [n n u n y y u n n y]    % Second profile
Odds favoring Group 1:   0.2383
Classify in Group 2
odds                                % Third call for calculating procedure
Enter profile matrix E  [y y n y u u n n y y]    % Third profile
Odds favoring Group 1:   5.05
Classify in Group 1

The principal feature of the m-procedure odds is the scheme for selecting the numbers from the A
and B matrices. If E = [y y n y u u n n y y], then the coding translates this into the actual numerical
matrix [1 1 2 1 3 3 2 2 1 1] used internally. Then A(:,E) is a matrix with columns corresponding to
elements of E. Thus

e = A(:,E)
e =   42    42    22    42     5     5    22    22    42    42
      34    34    27    34     8     8    27    27    34    34
      15    15    45    15     9     9    45    45    15    15
      19    19    44    19     6     6    44    44    19    19
      22    22    43    22     4     4    43    43    22    22
      41    41    13    41    15    15    13    13    41    41
       9     9    52     9     8     8    52    52     9     9
      40    40    26    40     3     3    26    26    40    40
      48    48    12    48     9     9    12    12    48    48
      20    20    37    20    12    12    37    37    20    20



The ith entry on the ith column is the count corresponding to the answer to the ith question. For 
example, the answer to the third question is N (no), and the corresponding count is the third entry 
in the N (second) column of A. The element on the diagonal in the third column of A(:,E) is 
the third element in that column, and hence the desired third entry of the N column. By picking 
out the elements on the diagonal by the command diag(A(:,E)), we have the desired set of counts 
corresponding to the profile. The same is true for diag(B(:,E)). 
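The diagonal-selection scheme can be tried out directly. The following sketch is not the author's m-procedure odds; it is a minimal base-MATLAB illustration of the same indexing idea applied to the three profiles of the example, with illustrative names (profiles, postOdds, n1, n2).

```matlab
A = [42 22 5; 34 27 8; 15 45 9; 19 44 6; 22 43 4;
     41 13 15; 9 52 8; 40 26 3; 48 12 9; 20 37 12];   % Group 1 counts
B = [20 31 5; 16 37 3; 33 19 4; 31 18 7; 23 28 5;
     14 37 5; 31 17 8; 13 38 5; 27 24 5; 35 16 5];    % Group 2 counts
y = 1;  n = 2;  u = 3;                                % answer codes = column numbers

profiles = [y n y n y u n u y u;
            n n u n y y u n n y;
            y y n y u u n n y y];

n1 = sum(A(1,:));  n2 = sum(B(1,:));                  % group sizes, 69 and 56
postOdds = zeros(1,3);
for r = 1:3
    E = profiles(r,:);
    a = diag(A(:,E));                                 % counts A(i,E(i)), i = 1..10
    b = diag(B(:,E));                                 % counts B(i,E(i))
    % prior odds n1/n2 times the ratio of conditional probabilities
    postOdds(r) = (n1/n2) * prod(a/n1) / prod(b/n2);
end
disp(postOdds)
group = 1 + (postOdds < 1)                            % 1 or 2 for each profile
```

Forming the full 10 by 10 matrix A(:,E) just to read its diagonal is wasteful for long questionnaires, but for a battery of this size it keeps the code very short.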

Sometimes the data are given in terms of conditional probabilities and probabilities. A slight 
modification of the procedure handles this case. For purposes of comparison, we convert the problem 
above to this form by converting the counts in matrices A and B to conditional probabilities. We do 
this by dividing by the total count in each group (69 and 56 in this case). Also, $P(G_1) = 69/125 = 0.552$
and $P(G_2) = 56/125 = 0.448$.



        GROUP 1  P(G_1) = 69/125        GROUP 2  P(G_2) = 56/125
  Q     Yes      No       Unc.          Yes      No       Unc.
  1     0.6087   0.3188   0.0725        0.3571   0.5536   0.0893
  2     0.4928   0.3913   0.1159        0.2857   0.6607   0.0536
  3     0.2174   0.6522   0.1304        0.5893   0.3393   0.0714
  4     0.2754   0.6376   0.0870        0.5536   0.3214   0.1250
  5     0.3188   0.6232   0.0580        0.4107   0.5000   0.0893
  6     0.5942   0.1884   0.2174        0.2500   0.6607   0.0893
  7     0.1304   0.7536   0.1160        0.5536   0.3036   0.1428
  8     0.5797   0.3768   0.0435        0.2321   0.6786   0.0893
  9     0.6957   0.1739   0.1304        0.4821   0.4286   0.0893
 10     0.2899   0.5362   0.1739        0.6250   0.2857   0.0893

Table 5.6 

These data are in an m-file oddsp4.m. The modified setup m-procedure oddsdp uses the conditional 
probabilities, then calls for the m-procedure odds. 

Example 5.13: Calculation using conditional probability data 

oddsp4                 % Call for converted data (probabilities)
oddsdp                 % Setup m-procedure for probabilities
Enter conditional probabilities for Group 1  A
Enter conditional probabilities for Group 2  B
Probability p1 individual is from Group 1  0.552
 Number of questions = 10
 Answers per question = 3
 Enter code for answers and call for procedure "odds"
y = 1;
n = 2;
u = 3;
odds
Enter profile matrix E  [y n y n y u n u y u]
Odds favoring Group 1:   5.845
Classify in Group 1

The slight discrepancy in the odds favoring Group 1 (5.8454 compared with 5.8452) can be 
attributed to rounding of the conditional probabilities to four places. The display above rounds 
the odds to 5.845 in each case, so the discrepancy is not apparent. This is quite acceptable, since 
the discrepancy has no effect on the results. 
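
The calculation behind the reported odds can be sketched directly in base MATLAB. This is a minimal sketch under the conditional independence assumption, not the text's m-procedures oddsdp and odds; the variable names `likeG1`, `likeG2`, and `oddsG1` are chosen here for illustration. The matrices are the conditional probabilities of Table 5.6.

```matlab
% Conditional probabilities from Table 5.6 (columns: yes, no, uncertain)
A = [0.6087 0.3188 0.0725; 0.4928 0.3913 0.1159; 0.2174 0.6522 0.1304;
     0.2754 0.6376 0.0870; 0.3188 0.6232 0.0580; 0.5942 0.1884 0.2174;
     0.1304 0.7536 0.1160; 0.5797 0.3768 0.0435; 0.6957 0.1739 0.1304;
     0.2899 0.5362 0.1739];
B = [0.3571 0.5536 0.0893; 0.2857 0.6607 0.0536; 0.5893 0.3393 0.0714;
     0.5536 0.3214 0.1250; 0.4107 0.5000 0.0893; 0.2500 0.6607 0.0893;
     0.5536 0.3036 0.1428; 0.2321 0.6786 0.0893; 0.4821 0.4286 0.0893;
     0.6250 0.2857 0.0893];
p1 = 0.552;                              % P(G1); P(G2) = 1 - p1
y = 1; n = 2; u = 3;                     % answer codes
E = [y n y n y u n u y u];               % profile to classify

likeG1 = prod(diag(A(:,E)));             % P(profile | G1), product over questions
likeG2 = prod(diag(B(:,E)));             % P(profile | G2)
oddsG1 = (p1/(1 - p1))*likeG1/likeG2;    % posterior odds favoring Group 1

fprintf('Odds favoring Group 1: %.3f\n', oddsG1)
if oddsG1 > 1
    disp('Classify in Group 1')
else
    disp('Classify in Group 2')
end
```

The result should agree with the value 5.845 reported in Example 5.13, up to the rounding of the table entries.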



5.3 Problems on Conditional Independence 3 

Exercise 5.1 (Solution on p. 129.) 

Suppose $\{A, B\}$ ci $|C$ and $\{A, B\}$ ci $|C^c$, $P(C) = 0.7$, and 

$$P(A|C) = 0.4 \quad P(B|C) = 0.6 \quad P(A|C^c) = 0.3 \quad P(B|C^c) = 0.2 \tag{5.47}$$

Show whether or not the pair $\{A, B\}$ is independent. 

Exercise 5.2 (Solution on p. 129.) 

Suppose $\{A_1, A_2, A_3\}$ ci $|C$ and ci $|C^c$, with $P(C) = 0.4$, and 

$$P(A_i|C) = 0.90, 0.85, 0.80 \quad P(A_i|C^c) = 0.20, 0.15, 0.20 \quad \text{for } i = 1, 2, 3, \text{ respectively} \tag{5.48}$$

Determine the posterior odds $P(C|A_1 A_2^c A_3) / P(C^c|A_1 A_2^c A_3)$. 

Exercise 5.3 (Solution on p. 129.) 

Five world class sprinters are entered in a 200 meter dash. Each has a good chance to break 
the current track record. There is a thirty percent chance a late cold front will move in, bringing 
conditions that adversely affect the runners. Otherwise, conditions are expected to be favorable for 
an outstanding race. Their respective probabilities of breaking the record are: 

• Good weather (no front): 0.75, 0.80, 0.65, 0.70, 0.85 
• Poor weather (front in): 0.60, 0.65, 0.50, 0.55, 0.70 

The performances are (conditionally) independent, given good weather, and also, given poor 
weather. What is the probability that three or more will break the track record? 

Hint. If $B_3$ is the event of three or more, $P(B_3) = P(B_3|W) P(W) + P(B_3|W^c) P(W^c)$. 

3 This content is available online at <http://cnx.org/content/m24205/1.4/>. 

Exercise 5.4 (Solution on p. 129.) 

A device has five sensors connected to an alarm system. The alarm is given if three or more of 
the sensors trigger a switch. If a dangerous condition is present, each of the switches has high (but 
not unit) probability of activating; if the dangerous condition does not exist, each of the switches 
has low (but not zero) probability of activating (falsely). Suppose D = the event of the dangerous 
condition and A = the event the alarm is activated. Proper operation consists of $AD \bigvee A^c D^c$. 
Suppose $E_i$ = the event the ith unit is activated. Since the switches operate independently, we 
suppose 

$$\{E_1, E_2, E_3, E_4, E_5\} \text{ ci } |D \text{ and ci } |D^c \tag{5.49}$$

Assume the conditional probabilities of the $E_i$, given $D$, are 0.91, 0.93, 0.96, 0.87, 0.97, and given 
$D^c$, are 0.03, 0.02, 0.07, 0.04, 0.01, respectively. If $P(D) = 0.02$, what is the probability the alarm 
system acts properly? Suggestion. Use the conditional independence and the procedure ckn. 

Exercise 5.5 (Solution on p. 129.) 

Seven students plan to complete a term paper over the Thanksgiving recess. They work indepen- 
dently; however, the likelihood of completion depends upon the weather. If the weather is very 
pleasant, they are more likely to engage in outdoor activities and put off work on the paper. Let $E_i$ 
be the event the ith student completes his or her paper, $A_k$ be the event that k or more complete 
during the recess, and W be the event the weather is highly conducive to outdoor activity. It is 
reasonable to suppose $\{E_i : 1 \le i \le 7\}$ ci $|W$ and ci $|W^c$. Suppose 

$$P(E_i|W) = 0.4, 0.5, 0.3, 0.7, 0.5, 0.6, 0.2 \tag{5.50}$$

$$P(E_i|W^c) = 0.7, 0.8, 0.5, 0.9, 0.7, 0.8, 0.5 \tag{5.51}$$

respectively, and $P(W) = 0.8$. Determine the probability $P(A_4)$ that four or more complete 
their papers and $P(A_5)$ that five or more finish. 

Exercise 5.6 (Solution on p. 129.) 

A manufacturer claims to have improved the reliability of his product. Formerly, the product had 
probability 0.65 of operating 1000 hours without failure. The manufacturer claims this probability 
is now 0.80. A sample of size 20 is tested. Determine the odds favoring the new probability for 
various numbers of surviving units under the assumption the prior odds are 1 to 1. How many 
survivors would be required to make the claim creditable? 

Exercise 5.7 (Solution on p. 130.) 

A real estate agent in a neighborhood heavily populated by affluent professional persons is working 
with a customer. The agent is trying to assess the likelihood the customer will actually buy. His 
experience indicates the following: if H is the event the customer buys, S is the event the customer 
is a professional with good income, and E is the event the customer drives a prestigious car, then 

$$P(S) = 0.7 \quad P(S|H) = 0.90 \quad P(S|H^c) = 0.2 \quad P(E|S) = 0.95 \quad P(E|S^c) = 0.25 \tag{5.52}$$

Since buying a house and owning a prestigious car are not related for a given owner, it seems 
reasonable to suppose $P(E|HS) = P(E|H^c S)$ and $P(E|HS^c) = P(E|H^c S^c)$. The customer 
drives a Cadillac. What are the odds he will buy a house? 

Exercise 5.8 (Solution on p. 130.) 

In deciding whether or not to drill an oil well in a certain location, a company undertakes a 
geophysical survey. On the basis of past experience, the decision makers feel the odds are about 
four to one favoring success. Various other probabilities can be assigned on the basis of past 
experience. Let 

• H be the event that a well would be successful 

• S be the event the geological conditions are favorable 

• E be the event the results of the geophysical survey are positive 

The initial, or prior, odds are $P(H)/P(H^c) = 4$. Previous experience indicates 

$$P(S|H) = 0.9 \quad P(S|H^c) = 0.20 \quad P(E|S) = 0.95 \quad P(E|S^c) = 0.10 \tag{5.53}$$

Make reasonable assumptions based on the fact that the result of the geophysical survey depends 
upon the geological formations and not on the presence or absence of oil. The result of the survey 
is favorable. Determine the posterior odds $P(H|E)/P(H^c|E)$. 

Exercise 5.9 (Solution on p. 130.) 

A software firm is planning to deliver a custom package. Past experience indicates the odds are at 
least four to one that it will pass customer acceptance tests. As a check, the program is subjected 
to two different benchmark runs. Both are successful. Given the following data, what are the odds 
favoring successful operation in practice? Let 

• H be the event the performance is satisfactory 

• S be the event the system satisfies customer acceptance tests 

• $E_1$ be the event the first benchmark tests are satisfactory. 

• $E_2$ be the event the second benchmark test is ok. 

Under the usual conditions, we may assume $\{H, E_1, E_2\}$ ci $|S$ and ci $|S^c$. Reliability data show 

$$P(H|S) = 0.95, \quad P(H|S^c) = 0.45 \tag{5.54}$$

$$P(E_1|S) = 0.90 \quad P(E_1|S^c) = 0.25 \quad P(E_2|S) = 0.95 \quad P(E_2|S^c) = 0.20 \tag{5.55}$$

Determine the posterior odds $P(H|E_1 E_2) / P(H^c|E_1 E_2)$. 

Exercise 5.10 (Solution on p. 130.) 

A research group is contemplating purchase of a new software package to perform some specialized 
calculations. The systems manager decides to do two sets of diagnostic tests for significant bugs that 
might hamper operation in the intended application. The tests are carried out in an operationally 
independent manner. The following analysis of the results is made. 

• H = the event the program is satisfactory for the intended application 

• S = the event the program is free of significant bugs 

• $E_1$ = the event the first diagnostic tests are satisfactory 

• $E_2$ = the event the second diagnostic tests are satisfactory 

Since the tests are for the presence of bugs, and are operationally independent, it seems reasonable to 
assume $\{H, E_1, E_2\}$ ci $|S$ and $\{H, E_1, E_2\}$ ci $|S^c$. Because of the reliability of the software company, 
the manager thinks $P(S) = 0.85$. Also, experience suggests 

  $P(H|S) = 0.95$      $P(E_1|S) = 0.90$      $P(E_2|S) = 0.95$
  $P(H|S^c) = 0.30$    $P(E_1|S^c) = 0.20$    $P(E_2|S^c) = 0.25$

Table 5.7 

Determine the posterior odds favoring H if results of both diagnostic tests are satisfactory. 

Exercise 5.11 (Solution on p. 131.) 

A company is considering a new product now undergoing field testing. Let 

• H be the event the product is introduced and successful 

• S be the event the R&D group produces a product with the desired characteristics. 

• E be the event the testing program indicates the product is satisfactory 

The company assumes $P(S) = 0.9$ and the conditional probabilities 

$$P(H|S) = 0.90 \quad P(H|S^c) = 0.10 \quad P(E|S) = 0.95 \quad P(E|S^c) = 0.15 \tag{5.56}$$

Since the testing of the merchandise is not affected by market success or failure, it seems reasonable 
to suppose $\{H, E\}$ ci $|S$ and ci $|S^c$. The field tests are favorable. Determine $P(H|E) / P(H^c|E)$. 

Exercise 5.12 (Solution on p. 131.) 

Martha is wondering if she will get a five percent annual raise at the end of the fiscal year. She 
understands this is more likely if the company's net profits increase by ten percent or more. These 
will be influenced by company sales volume. Let 

• H = the event she will get the raise 

• S = the event company profits increase by ten percent or more 

• E = the event sales volume is up by fifteen percent or more 

Since the prospect of a raise depends upon profits, not directly on sales, she supposes $\{H, E\}$ ci $|S$ 
and $\{H, E\}$ ci $|S^c$. She thinks the prior odds favoring a suitable profit increase are about three to one. 
Also, it seems reasonable to suppose 

$$P(H|S) = 0.80 \quad P(H|S^c) = 0.10 \quad P(E|S) = 0.95 \quad P(E|S^c) = 0.10 \tag{5.57}$$

End of the year records show that sales increased by eighteen percent. What is the probability 
Martha will get her raise? 

Exercise 5.13 (Solution on p. 131.) 

A physician thinks the odds are about 2 to 1 that a patient has a certain disease. He seeks the 
"independent" advice of three specialists. Let H be the event the disease is present, and A, B, C be 
the events the respective consultants agree this is the case. The physician decides to go with the 
majority. Since the advisers act in an operationally independent manner, it seems reasonable to 
suppose $\{A, B, C\}$ ci $|H$ and ci $|H^c$. Experience indicates 

$$P(A|H) = 0.8, \quad P(B|H) = 0.7, \quad P(C|H) = 0.75 \tag{5.58}$$

$$P(A^c|H^c) = 0.85, \quad P(B^c|H^c) = 0.8, \quad P(C^c|H^c) = 0.7 \tag{5.59}$$

What is the probability of the right decision (i.e., he treats the disease if two or more think it is 
present, and does not if two or more think the disease is not present)? 

Exercise 5.14 (Solution on p. 131.) 

A software company has developed a new computer game designed to appeal to teenagers and 
young adults. It is felt that there is good probability it will appeal to college students, and that if 
it appeals to college students it will appeal to a general youth market. To check the likelihood of 
appeal to college students, it is decided to test first by a sales campaign at Rice and University of 
Texas, Austin. The following analysis of the situation is made. 

• H = the event the sales to the general market will be good 

• S = the event the game appeals to college students 

• $E_1$ = the event the sales are good at Rice 

• $E_2$ = the event the sales are good at UT, Austin 

Since the tests for reception are at two separate universities and are operationally independent, 
it seems reasonable to assume $\{H, E_1, E_2\}$ ci $|S$ and $\{H, E_1, E_2\}$ ci $|S^c$. Because of their 
previous experience in game sales, the managers think $P(S) = 0.80$. Also, experience suggests 

  $P(H|S) = 0.95$      $P(E_1|S) = 0.90$      $P(E_2|S) = 0.95$
  $P(H|S^c) = 0.30$    $P(E_1|S^c) = 0.20$    $P(E_2|S^c) = 0.25$

Table 5.8 

Determine the posterior odds favoring H if sales results are satisfactory at both schools. 

Exercise 5.15 (Solution on p. 131.) 

In a region in the Gulf Coast area, oil deposits are highly likely to be associated with underground 
salt domes. If H is the event that an oil deposit is present in an area, and S is the event of a salt 
dome in the area, experience indicates $P(S|H) = 0.9$ and $P(S|H^c) = 0.1$. Company executives 
believe the odds favoring oil in the area are at least 1 in 10. The company decides to conduct two independent 
geophysical surveys for the presence of a salt dome. Let $E_1, E_2$ be the events the surveys indicate 
a salt dome. Because the surveys are tests for the geological structure, not the presence of oil, and 
the tests are carried out in an operationally independent manner, it seems reasonable to assume 
$\{H, E_1, E_2\}$ ci $|S$ and ci $|S^c$. Data on the reliability of the surveys yield the following probabilities 

$$P(E_1|S) = 0.95 \quad P(E_1|S^c) = 0.05 \quad P(E_2|S) = 0.90 \quad P(E_2|S^c) = 0.10 \tag{5.60}$$

Determine the posterior odds $P(H|E_1 E_2) / P(H^c|E_1 E_2)$. Should the well be drilled? 

Exercise 5.16 (Solution on p. 131.) 

A sample of 150 subjects is taken from a population which has two subgroups. The subgroup 
membership of each subject in the sample is known. Each individual is asked a battery of ten 
questions designed to be independent, in the sense that the answer to any one is not affected by 
the answer to any other. The subjects answer independently. Data on the results are summarized 
in the following table: 



        GROUP 1 (84 members)        GROUP 2 (66 members)
  Q     Yes     No      Unc         Yes     No      Unc
  1     51      26       7          27      34       5
  2     42      32      10          19      43       4
  3     19      54      11          39      22       5
  4     24      53       7          38      19       9
  5     27      52       5          28      33       5
  6     49      19      16          19      41       6
  7     16      59       9          37      21       8
  8     47      32       5          19      42       5
  9     55      17      12          27      33       6
 10     24      53       7          39      21       6

Table 5.9 

Assume the data represent the general population consisting of these two groups, so that the 
data may be used to calculate probabilities and conditional probabilities. 

Several persons are interviewed. The result of each interview is a "profile" of answers to the 
questions. The goal is to classify the person in one of the two subgroups. 

For the following profiles, classify each individual in one of the subgroups. 

i. y, n, y, n, y, u, n, u, y, u 
ii. n, n, u, n, y, y, u, n, n, y 
iii. y, y, n, y, u, u, n, n, y, y 

Exercise 5.17 (Solution on p. 132.) 

The data of Exercise 5.16, above, are converted to conditional probabilities and probabilities, as 
follows (probabilities are rounded to two decimal places). 



        GROUP 1  P(G_1) = 0.56        GROUP 2  P(G_2) = 0.44
  Q     Yes     No      Unc           Yes     No      Unc
  1     0.61    0.31    0.08          0.41    0.51    0.08
  2     0.50    0.38    0.12          0.29    0.65    0.06
  3     0.23    0.64    0.13          0.59    0.33    0.08
  4     0.29    0.63    0.08          0.57    0.29    0.14
  5     0.32    0.62    0.06          0.42    0.50    0.08
  6     0.58    0.23    0.19          0.29    0.62    0.09
  7     0.19    0.70    0.11          0.56    0.32    0.12
  8     0.56    0.38    0.06          0.29    0.63    0.08
  9     0.65    0.20    0.15          0.41    0.50    0.09
 10     0.29    0.63    0.08          0.59    0.32    0.09

Table 5.10 

For the following profiles, classify each individual in one of the subgroups. 

i. y, n, y, n, y, u, n, u, y, u 
ii. n, n, u, n, y, y, u, n, n, y 
iii. y, y, n, y, u, u, n, n, y, y 

Solutions to Exercises in Chapter 5 

Solution to Exercise 5.1 (p. 123) 

$P(A) = P(A|C) P(C) + P(A|C^c) P(C^c)$, $P(B) = P(B|C) P(C) + P(B|C^c) P(C^c)$, and 
$P(AB) = P(A|C) P(B|C) P(C) + P(A|C^c) P(B|C^c) P(C^c)$. 

PA = 0.4*0.7 + 0.3*0.3 
PA = 0.3700 
PB = 0.6*0.7 + 0.2*0.3 
PB = 0.4800 
PA*PB 
ans = 0.1776 
PAB = 0.4*0.6*0.7 + 0.3*0.2*0.3 
PAB = 0.1860      % PAB not equal PA*PB; not independent 

Solution to Exercise 5.2 (p. 123) 

$$\frac{P(C|A_1 A_2^c A_3)}{P(C^c|A_1 A_2^c A_3)} = \frac{P(C)}{P(C^c)} \cdot \frac{P(A_1|C) P(A_2^c|C) P(A_3|C)}{P(A_1|C^c) P(A_2^c|C^c) P(A_3|C^c)} \tag{5.61}$$

$$= \frac{0.4}{0.6} \cdot \frac{0.9 \cdot 0.15 \cdot 0.80}{0.20 \cdot 0.85 \cdot 0.20} = \frac{108}{51} = 2.12 \tag{5.62}$$

Solution to Exercise 5.3 (p. 123) 

PW = 0.01*[75 80 65 70 85]; 
PWc = 0.01*[60 65 50 55 70]; 
P = ckn(PW,3)*0.7 + ckn(PWc,3)*0.3 
P = 0.8353 
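
The m-function ckn is one of the text's user-defined procedures and is not part of base MATLAB. As a minimal sketch of what the calculation involves, assuming only the conditional independence stated in the exercise, the distribution of the number of successes can be built up by repeated convolution; the names `dW` and `dWc` are illustrative and this is a stand-in, not the text's implementation of ckn.

```matlab
% Illustrative stand-in for ckn: probability that k or more of n
% independent events occur, from the distribution of the success count.
PW  = 0.01*[75 80 65 70 85];   % P(record | good weather), one entry per runner
PWc = 0.01*[60 65 50 55 70];   % P(record | poor weather)
k   = 3;

dW  = 1;                       % distribution of the count, built one runner at a time
dWc = 1;
for i = 1:numel(PW)
    dW  = conv(dW,  [1-PW(i)  PW(i)]);    % dW(j+1)  = P(j records | good weather)
    dWc = conv(dWc, [1-PWc(i) PWc(i)]);   % dWc(j+1) = P(j records | poor weather)
end

% Total probability over the two weather conditions, P(good) = 0.7
P = sum(dW(k+1:end))*0.7 + sum(dWc(k+1:end))*0.3
```

The value obtained should match the P = 0.8353 shown in the solution.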

Solution to Exercise 5.4 (p. 124) 

P1 = 0.01*[91 93 96 87 97]; 
P2 = 0.01*[3 2 7 4 1]; 
P = ckn(P1,3)*0.02 + (1 - ckn(P2,3))*0.98 
P = 0.9997 

Solution to Exercise 5.5 (p. 124) 

PW = 0.1*[4 5 3 7 5 6 2]; 
PWc = 0.1*[7 8 5 9 7 8 5]; 
PA4 = ckn(PW,4)*0.8 + ckn(PWc,4)*0.2 
PA4 = 0.4993 
PA5 = ckn(PW,5)*0.8 + ckn(PWc,5)*0.2 
PA5 = 0.2482 

Solution to Exercise 5.6 (p. 124) 

Let $E_1$ be the event the probability is 0.80 and $E_2$ be the event the probability is 0.65. Assume 
$P(E_1)/P(E_2) = 1$. 

$$\frac{P(E_1|S_n = k)}{P(E_2|S_n = k)} = \frac{P(E_1)}{P(E_2)} \cdot \frac{P(S_n = k|E_1)}{P(S_n = k|E_2)} \tag{5.63}$$

k = 1:20; 
odds = ibinom(20,0.80,k)./ibinom(20,0.65,k); 
disp([k;odds]') 
                      % Need at least 15 or 16 successes 
   13.0000    0.2958 
   14.0000    0.6372 
   15.0000    1.3723 
   16.0000    2.9558 
   17.0000    6.3663 
   18.0000   13.7121 
   19.0000   29.5337 
   20.0000   63.6111 
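
Since ibinom is a user-defined m-function, the odds can also be checked in base MATLAB. A minimal sketch, assuming prior odds of 1: the binomial coefficient C(20,k) appears in both numerator and denominator and cancels, so only the powers remain. The names `oddsNew` and `kmin` are illustrative.

```matlab
n = 20;
k = 13:20;                                  % numbers of surviving units shown above

% C(n,k) cancels in the likelihood ratio and the prior odds are 1,
% so the posterior odds favoring 0.80 over 0.65 reduce to:
oddsNew = (0.80/0.65).^k .* (0.20/0.35).^(n - k);
disp([k; oddsNew]')

kmin = k(find(oddsNew > 1, 1))              % smallest k with odds above 1
```

The odds first exceed 1 at k = 15, in agreement with the table; how far above 1 they must be before the claim is accepted is a matter of judgment.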



Solution to Exercise 5.7 (p. 124) 

Assumptions amount to $\{H, E\}$ ci $|S$ and ci $|S^c$. 

$$\frac{P(H|S)}{P(H^c|S)} = \frac{P(H) P(S|H)}{P(H^c) P(S|H^c)} \tag{5.64}$$

$$P(S) = P(H) P(S|H) + [1 - P(H)] P(S|H^c) \quad \text{which implies} \tag{5.65}$$

$$P(H) = \frac{P(S) - P(S|H^c)}{P(S|H) - P(S|H^c)} = 5/7 \quad \text{so that} \quad \frac{P(H|S)}{P(H^c|S)} = \frac{5}{2} \cdot \frac{0.9}{0.2} = \frac{45}{4} \tag{5.66}$$

Solution to Exercise 5.8 (p. 125) 

$$\frac{P(H|E)}{P(H^c|E)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(S|H) P(E|S) + P(S^c|H) P(E|S^c)}{P(S|H^c) P(E|S) + P(S^c|H^c) P(E|S^c)} \tag{5.67}$$

$$= 4 \cdot \frac{0.90 \cdot 0.95 + 0.10 \cdot 0.10}{0.20 \cdot 0.95 + 0.80 \cdot 0.10} = 12.8148 \tag{5.68}$$

Solution to Exercise 5.9 (p. 125) 

$$\frac{P(H|E_1 E_2)}{P(H^c|E_1 E_2)} = \frac{P(H E_1 E_2 S) + P(H E_1 E_2 S^c)}{P(H^c E_1 E_2 S) + P(H^c E_1 E_2 S^c)} \tag{5.69}$$

$$= \frac{P(S) P(H|S) P(E_1|S) P(E_2|S) + P(S^c) P(H|S^c) P(E_1|S^c) P(E_2|S^c)}{P(S) P(H^c|S) P(E_1|S) P(E_2|S) + P(S^c) P(H^c|S^c) P(E_1|S^c) P(E_2|S^c)} \tag{5.70}$$

$$= \frac{0.80 \cdot 0.95 \cdot 0.90 \cdot 0.95 + 0.20 \cdot 0.45 \cdot 0.25 \cdot 0.20}{0.80 \cdot 0.05 \cdot 0.90 \cdot 0.95 + 0.20 \cdot 0.55 \cdot 0.25 \cdot 0.20} = 16.4811 \tag{5.71}$$

Solution to Exercise 5.10 (p. 125) 

$$\frac{P(H|E_1 E_2)}{P(H^c|E_1 E_2)} = \frac{P(H E_1 E_2 S) + P(H E_1 E_2 S^c)}{P(H^c E_1 E_2 S) + P(H^c E_1 E_2 S^c)} \tag{5.72}$$

$$P(H E_1 E_2 S) = P(S) P(H|S) P(E_1|SH) P(E_2|SHE_1) = P(S) P(H|S) P(E_1|S) P(E_2|S) \tag{5.73}$$

with similar expressions for the other terms. 

$$\frac{P(H|E_1 E_2)}{P(H^c|E_1 E_2)} = \frac{0.85 \cdot 0.95 \cdot 0.90 \cdot 0.95 + 0.15 \cdot 0.30 \cdot 0.25 \cdot 0.20}{0.85 \cdot 0.05 \cdot 0.90 \cdot 0.95 + 0.15 \cdot 0.70 \cdot 0.25 \cdot 0.20} = 16.6555 \tag{5.74}$$
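
The same pattern recurs in Solutions 5.9 through 5.15: each joint probability is a product of conditional probabilities given S or given its complement. A minimal base-MATLAB sketch of the arithmetic for Exercise 5.10 follows, using the values of Table 5.7; the vector layout and the names `w`, `num`, `den`, and `postOdds` are illustrative choices, not the text's.

```matlab
pS  = 0.85;              % P(S)
pH  = [0.95 0.30];       % [P(H|S)   P(H|S^c)]
pE1 = [0.90 0.20];       % [P(E1|S)  P(E1|S^c)]
pE2 = [0.95 0.25];       % [P(E2|S)  P(E2|S^c)]
w   = [pS 1-pS];         % [P(S)     P(S^c)]

% Conditional independence given S and given S^c lets each joint
% probability factor; summing over the two cases gives P(H E1 E2), etc.
num = sum(w .* pH       .* pE1 .* pE2);    % P(H E1 E2)
den = sum(w .* (1 - pH) .* pE1 .* pE2);    % P(H^c E1 E2)
postOdds = num/den
```

Changing the four data lines adapts the sketch to the other exercises of this form.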



Solution to Exercise 5.11 (p. 126) 

$$\frac{P(H|E)}{P(H^c|E)} = \frac{P(S) P(H|S) P(E|S) + P(S^c) P(H|S^c) P(E|S^c)}{P(S) P(H^c|S) P(E|S) + P(S^c) P(H^c|S^c) P(E|S^c)} \tag{5.75}$$

$$= \frac{0.90 \cdot 0.90 \cdot 0.95 + 0.10 \cdot 0.10 \cdot 0.15}{0.90 \cdot 0.10 \cdot 0.95 + 0.10 \cdot 0.90 \cdot 0.15} = 7.7879 \tag{5.76}$$

Solution to Exercise 5.12 (p. 126) 

$$\frac{P(H|E)}{P(H^c|E)} = \frac{P(S) P(H|S) P(E|S) + P(S^c) P(H|S^c) P(E|S^c)}{P(S) P(H^c|S) P(E|S) + P(S^c) P(H^c|S^c) P(E|S^c)} \tag{5.77}$$

$$= \frac{0.75 \cdot 0.80 \cdot 0.95 + 0.25 \cdot 0.10 \cdot 0.10}{0.75 \cdot 0.20 \cdot 0.95 + 0.25 \cdot 0.90 \cdot 0.10} = 3.4697 \tag{5.78}$$

Solution to Exercise 5.13 (p. 126) 

PH = 0.01*[80 70 75]; 
PHc = 0.01*[85 80 70]; 
pH = 2/3; 
P = ckn(PH,2)*pH + ckn(PHc,2)*(1 - pH) 
P = 0.8577 

Solution to Exercise 5.14 (p. 126) 

$$\frac{P(H|E_1 E_2)}{P(H^c|E_1 E_2)} = \frac{P(H E_1 E_2 S) + P(H E_1 E_2 S^c)}{P(H^c E_1 E_2 S) + P(H^c E_1 E_2 S^c)} \tag{5.79}$$

$$= \frac{P(S) P(H|S) P(E_1|S) P(E_2|S) + P(S^c) P(H|S^c) P(E_1|S^c) P(E_2|S^c)}{P(S) P(H^c|S) P(E_1|S) P(E_2|S) + P(S^c) P(H^c|S^c) P(E_1|S^c) P(E_2|S^c)} \tag{5.80}$$

$$= \frac{0.80 \cdot 0.95 \cdot 0.90 \cdot 0.95 + 0.20 \cdot 0.30 \cdot 0.20 \cdot 0.25}{0.80 \cdot 0.05 \cdot 0.90 \cdot 0.95 + 0.20 \cdot 0.70 \cdot 0.20 \cdot 0.25} = 15.8447 \tag{5.81}$$

Solution to Exercise 5.15 (p. 127) 

$$\frac{P(H|E_1 E_2)}{P(H^c|E_1 E_2)} = \frac{P(H E_1 E_2 S) + P(H E_1 E_2 S^c)}{P(H^c E_1 E_2 S) + P(H^c E_1 E_2 S^c)} \tag{5.82}$$

$$P(H E_1 E_2 S) = P(H) P(S|H) P(E_1|SH) P(E_2|SHE_1) = P(H) P(S|H) P(E_1|S) P(E_2|S) \tag{5.83}$$

with similar expressions for the other terms. 

$$\frac{P(H|E_1 E_2)}{P(H^c|E_1 E_2)} = \frac{1}{10} \cdot \frac{0.9 \cdot 0.95 \cdot 0.90 + 0.10 \cdot 0.05 \cdot 0.10}{0.1 \cdot 0.95 \cdot 0.90 + 0.90 \cdot 0.05 \cdot 0.10} = 0.8556 \tag{5.84}$$

Solution to Exercise 5.16 (p. 127) 

% file npr05_16.m (Section 17.8.25: npr05_16) 
% Data for Exercise 5.16 
A = [51 26 7; 42 32 10; 19 54 11; 24 53 7; 27 52 5; 
     49 19 16; 16 59 9; 47 32 5; 55 17 12; 24 53 7]; 
B = [27 34 5; 19 43 4; 39 22 5; 38 19 9; 28 33 5; 
     19 41 6; 37 21 8; 19 42 5; 27 33 6; 39 21 6]; 
disp('Call for oddsdf') 

npr05_16 (Section 17.8.25: npr05_16) 
Call for oddsdf 
oddsdf 
Enter matrix A of frequencies for calibration group 1  A 
Enter matrix B of frequencies for calibration group 2  B 
Number of questions = 10 
Answers per question = 3 
Enter code for answers and call for procedure "odds" 
y = 1; 
n = 2; 
u = 3; 
odds 
Enter profile matrix E  [y n y n y u n u y u] 
Odds favoring Group 1:   3.743 
Classify in Group 1 
odds 
Enter profile matrix E  [n n u n y y u n n y] 
Odds favoring Group 1:   0.2693 
Classify in Group 2 
odds 
Enter profile matrix E  [y y n y u u n n y y] 
Odds favoring Group 1:   5.286 
Classify in Group 1 

Solution to Exercise 5.17 (p. 128) 

npr05_17 (Section 17.8.26: npr05_17) 
% file npr05_17.m (Section 17.8.26: npr05_17) 
% Data for Exercise 5.17 
PG1 = 84/150; 
PG2 = 66/150; 
A = [0.61 0.31 0.08 
     0.50 0.38 0.12 
     0.23 0.64 0.13 
     0.29 0.63 0.08 
     0.32 0.62 0.06 
     0.58 0.23 0.19 
     0.19 0.70 0.11 
     0.56 0.38 0.06 
     0.65 0.20 0.15 
     0.29 0.63 0.08]; 
B = [0.41 0.51 0.08 
     0.29 0.65 0.06 
     0.59 0.33 0.08 
     0.57 0.29 0.14 
     0.42 0.50 0.08 
     0.29 0.62 0.09 
     0.56 0.32 0.12 
     0.29 0.64 0.08 
     0.41 0.50 0.09 
     0.59 0.32 0.09]; 
disp('Call for oddsdp') 
Call for oddsdp 
oddsdp 
Enter matrix A of conditional probabilities for Group 1  A 
Enter matrix B of conditional probabilities for Group 2  B 
Probability p1 an individual is from Group 1  PG1 
Number of questions = 10 
Answers per question = 3 
Enter code for answers and call for procedure "odds" 
y = 1; 
n = 2; 
u = 3; 
odds 
Enter profile matrix E  [y n y n y u n u y u] 
Odds favoring Group 1:   3.486 
Classify in Group 1 
odds 
Enter profile matrix E  [n n u n y y u n n y] 
Odds favoring Group 1:   0.2603 
Classify in Group 2 
odds 
Enter profile matrix E  [y y n y u u n n y y] 
Odds favoring Group 1:   5.162 
Classify in Group 1 

Chapter 6 

Random Variables and Probabilities 

6.1 Random Variables and Probabilities 1 

6.1.1 Introduction 

Probability associates with an event a number which indicates the likelihood of the occurrence of that event 
on any trial. An event is modeled as the set of those possible outcomes of an experiment which satisfy a 
property or proposition characterizing the event. 

Often, each outcome is characterized by a number. The experiment is performed. If the outcome is 
observed as a physical quantity, the size of that quantity (in prescribed units) is the entity actually observed. 
In many nonnumerical cases, it is convenient to assign a number to each outcome. For example, in a coin 
flipping experiment, a "head" may be represented by a 1 and a "tail" by a 0. In a Bernoulli trial, a success 
may be represented by a 1 and a failure by a 0. In a sequence of trials, we may be interested in the number 
of successes in a sequence of n component trials. One could assign a distinct number to each card in a deck 
of playing cards. Observations of the result of selecting a card could be recorded in terms of individual 
numbers. In each case, the associated number becomes a property of the outcome. 

6.1.2 Random variables as functions 

We consider in this chapter real random variables (i.e., real-valued random variables). In the chapter 
"Random Vectors and Joint Distributions" (Section 8.1), we extend the notion to vector-valued random 
quantities. The fundamental idea of a real random variable is the assignment of a real number to each 
elementary outcome $\omega$ in the basic space $\Omega$. Such an assignment amounts to determining a function $X$, 
whose domain is $\Omega$ and whose range is a subset of the real line $\mathbf{R}$. Recall that a real-valued function on a 
domain (say an interval $I$ on the real line) is characterized by the assignment of a real number $y$ to each 
element $x$ (argument) in the domain. For a real-valued function of a real variable, it is often possible to 
write a formula or otherwise state a rule describing the assignment of the value to each argument. Except 
in special cases, we cannot write a formula for a random variable $X$. However, random variables share some 
important general properties of functions which play an essential role in determining their usefulness. 

Mappings and inverse mappings 

There are various ways of characterizing a function. Probably the most useful for our purposes is as a 
mapping from the domain $\Omega$ to the codomain $\mathbf{R}$. We find the mapping diagram of Figure 6.1 extremely useful 
in visualizing the essential patterns. Random variable $X$, as a mapping from basic space $\Omega$ to the real line 
$\mathbf{R}$, assigns to each element $\omega$ a value $t = X(\omega)$. The object point $\omega$ is mapped, or carried, into the image 
point $t$. Each $\omega$ is mapped into exactly one $t$, although several $\omega$ may have the same image point. 



1 This content is available online at <http://cnx.org/content/m23260/1.8/>. 

Figure 6.1: The basic mapping diagram $t = X(\omega)$ 



Associated with a function $X$ as a mapping are the inverse mapping $X^{-1}$ and the inverse images it 
produces. Let $M$ be a set of numbers on the real line. By the inverse image of $M$ under the mapping $X$, we 
mean the set of all those $\omega \in \Omega$ which are mapped into $M$ by $X$ (see Figure 6.2). If $X$ does not take a value 
in $M$, the inverse image is the empty set (impossible event). If $M$ includes the range of $X$ (the set of all 
possible values of $X$), the inverse image is the entire basic space $\Omega$. Formally we write 

$$X^{-1}(M) = \{\omega : X(\omega) \in M\} \tag{6.1}$$

Now we assume the set $X^{-1}(M)$, a subset of $\Omega$, is an event for each $M$. A detailed examination of that 
assertion is a topic in measure theory. Fortunately, the results of measure theory ensure that we may make 
the assumption for any $X$ and any subset $M$ of the real line likely to be encountered in practice. The set 
$X^{-1}(M)$ is the event that $X$ takes a value in $M$. As an event, it may be assigned a probability. 




Figure 6.2: $E$ is the inverse image $X^{-1}(M)$. 

Example 6.1: Some illustrative examples. 

1. $X = I_E$ where $E$ is an event with probability $p$. Now $X$ takes on only two values, 0 and 1. 
The event that $X$ takes on the value 1 is the set 

$$\{\omega : X(\omega) = 1\} = X^{-1}(\{1\}) = E \tag{6.2}$$

so that $P(\{\omega : X(\omega) = 1\}) = p$. This rather ungainly notation is shortened to $P(X = 1) = p$. 
Similarly, $P(X = 0) = 1 - p$. Consider any set $M$. If neither 1 nor 0 is in $M$, then $X^{-1}(M) = \emptyset$. 
If 0 is in $M$, but 1 is not, then $X^{-1}(M) = E^c$. If 1 is in $M$, but 0 is not, then $X^{-1}(M) = E$. 
If both 1 and 0 are in $M$, then $X^{-1}(M) = \Omega$. In this case the class of all events $X^{-1}(M)$ 
consists of event $E$, its complement $E^c$, the impossible event $\emptyset$, and the sure event $\Omega$. 

2. Consider a sequence of $n$ Bernoulli trials, with probability $p$ of success. Let $S_n$ be the random 
variable whose value is the number of successes in the sequence of $n$ component trials. Then, 
according to the analysis in the section "Bernoulli Trials and the Binomial Distribution" 
(Section 4.3.2: Bernoulli trials and the binomial distribution) 

$$P(S_n = k) = C(n, k)\, p^k (1 - p)^{n-k} \quad 0 \le k \le n \tag{6.3}$$



Before considering further examples, we note a general property of inverse images. We state it in terms of a 
random variable, which maps $\Omega$ to the real line (see Figure 6.3). 

Preservation of set operations 

Let $X$ be a mapping from $\Omega$ to the real line $\mathbf{R}$. If $M$, $M_i$, $i \in J$, are sets of real numbers, with respective 
inverse images $E$, $E_i$, then 

$$X^{-1}(M^c) = E^c, \quad X^{-1}\Big(\bigcup_{i \in J} M_i\Big) = \bigcup_{i \in J} E_i \quad \text{and} \quad X^{-1}\Big(\bigcap_{i \in J} M_i\Big) = \bigcap_{i \in J} E_i \tag{6.4}$$

Examination of simple graphical examples exhibits the plausibility of these patterns. Formal proofs amount 
to careful reading of the notation. Central to the structure are the facts that each element $\omega$ is mapped 
into only one image point $t$ and that the inverse image of $M$ is the set of all those $\omega$ which are mapped into 
image points in $M$. 
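
A small finite case makes the patterns concrete. The following sketch is hypothetical throughout: a six-point basic space, an arbitrary assignment of values, and a helper `invim` (a name chosen here) that returns the inverse image of a set of values. The finite range of X stands in for the real line when forming complements.

```matlab
% Hypothetical finite basic space: six outcomes, X assigns a value to each
omega = 1:6;
X     = [0 1 1 2 2 3];              % X(omega(i)) is X(i); several outcomes share a value
R     = 0:3;                        % range of X, standing in for the real line

invim = @(M) omega(ismember(X, M)); % inverse image of M, as a set of outcomes

M1 = [0 1];   M2 = [1 2];
E1 = invim(M1);                     % outcomes mapped into M1
E2 = invim(M2);                     % outcomes mapped into M2

okC = isequal(invim(setdiff(R, M1)),    setdiff(omega, E1))   % complement
okU = isequal(invim(union(M1, M2)),     union(E1, E2))        % union
okI = isequal(invim(intersect(M1, M2)), intersect(E1, E2))    % intersection
```

Each comparison returns logical 1, as (6.4) requires. Note that two outcomes can share an image point, but no outcome has two image points, which is what makes the complement and intersection patterns hold.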




Figure 6.3: Preservation of set operations by inverse images. 
An easy, but important, consequence of the general patterns is that the inverse images of disjoint $M$, $N$ 
are also disjoint. This implies that the inverse of a disjoint union of $M_i$ is a disjoint union of the separate 
inverse images. 

Example 6.2: Events determined by a random variable 

Consider, again, the random variable $S_n$ which counts the number of successes in a sequence of 
$n$ Bernoulli trials. Let $n = 10$ and $p = 0.33$. Suppose we want to determine the probability 
$P(2 < S_{10} \le 8)$. Let $A_k = \{\omega : S_{10}(\omega) = k\}$, which we usually shorten to $A_k = \{S_{10} = k\}$. Now 
the $A_k$ form a partition, since we cannot have $\omega \in A_k$ and $\omega \in A_j$, $j \ne k$ (i.e., for any $\omega$, we 
cannot have two values for $S_n(\omega)$). Now, 

$$\{2 < S_{10} \le 8\} = A_3 \bigvee A_4 \bigvee A_5 \bigvee A_6 \bigvee A_7 \bigvee A_8 \tag{6.5}$$

since $S_{10}$ takes on a value greater than 2 but no greater than 8 iff it takes one of the integer values 
from 3 to 8. By the additivity of probability, 

$$P(2 < S_{10} \le 8) = \sum_{k=3}^{8} P(S_{10} = k) = 0.6927 \tag{6.6}$$
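
The sum in (6.6) can be evaluated with base MATLAB alone. This is a minimal sketch using the binomial formula (6.3) and the built-in `nchoosek`; it does not use the text's m-functions.

```matlab
n = 10;  p = 0.33;
k = 3:8;                                   % integer values with 2 < k <= 8

C = arrayfun(@(j) nchoosek(n, j), k);      % binomial coefficients C(n,k)
P = sum(C .* p.^k .* (1 - p).^(n - k))     % P(2 < S10 <= 8)
```

The result should agree with the value 0.6927 given in (6.6).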



6.1.3 Mass transfer and induced probability distribution 

Because of the abstract nature of the basic space and the class of events, we are limited in the kinds of 
calculations that can be performed meaningfully with the probabilities on the basic space. We represent 
probability as mass distributed on the basic space and visualize this with the aid of general Venn diagrams 
and minterm maps. We now think of the mapping from \(\Omega\) to \(\mathbb{R}\) as producing a point-by-point transfer of
the probability mass to the real line. This may be done as follows:

To any set \(M\) on the real line assign probability mass \(P_X(M) = P(X^{-1}(M))\).

It is apparent that \(P_X(M) \ge 0\) and \(P_X(\mathbb{R}) = P(\Omega) = 1\). And because of the preservation of set operations by the inverse mapping,

\[P_X\Bigl(\bigvee_{i=1}^{\infty} M_i\Bigr) = P\Bigl(X^{-1}\Bigl(\bigvee_{i=1}^{\infty} M_i\Bigr)\Bigr) = P\Bigl(\bigvee_{i=1}^{\infty} X^{-1}(M_i)\Bigr) = \sum_{i=1}^{\infty} P\bigl(X^{-1}(M_i)\bigr) = \sum_{i=1}^{\infty} P_X(M_i) \tag{6.7}\]

This means that Px has the properties of a probability measure defined on the subsets of the real line. 
Some results of measure theory show that this probability is defined uniquely on a class of subsets of R 
that includes any set normally encountered in applications. We have achieved a point-by-point transfer 
of the probability apparatus to the real line in such a manner that we can make calculations about the 
random variable \(X\). We call \(P_X\) the probability measure induced by \(X\). Its importance lies in the fact that
\(P(X \in M) = P_X(M)\). Thus, to determine the likelihood that random quantity \(X\) will take on a value in
set M, we determine how much induced probability mass is in the set M. This transfer produces what is 
called the probability distribution for X. In the chapter "Distribution and Density Functions" (Section 7.1), 
we consider useful ways to describe the probability distribution induced by a random variable. We turn first 
to a special class of random variables. 
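For a finite basic space the mass transfer can be carried out literally. The sketch below is in Python and rests on assumptions: the four outcomes, their probabilities `P`, and the values of `X` are all hypothetical, and a set \(M\) of real numbers is represented by a membership test (a predicate).

```python
from collections import defaultdict

# Hypothetical finite basic space: outcome -> probability mass (sums to 1).
P = {"w1": 0.2, "w2": 0.1, "w3": 0.3, "w4": 0.4}
# Hypothetical random variable: outcome -> real value.
X = {"w1": 0.0, "w2": 1.0, "w3": 1.0, "w4": 2.5}

def induced_mass(M):
    """P_X(M) = P(X^{-1}(M)), where M is given as a predicate on real numbers."""
    return sum(mass for w, mass in P.items() if M(X[w]))

# Point-by-point transfer: each outcome carries its mass to the point X(w).
PX = defaultdict(float)
for w, mass in P.items():
    PX[X[w]] += mass

p_interval = induced_mass(lambda t: 0.5 <= t <= 3)   # P(X in [0.5, 3])
print(dict(PX), p_interval)
```

Once `PX` is known, the basic space is no longer needed: any probability of the form \(P(X \in M)\) is a sum of induced masses at points of \(M\).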

6.1.4 Simple random variables 

We consider, in some detail, random variables which have only a finite set of possible values. These are called 
simple random variables. Thus the term "simple" is used in a special, technical sense. The importance of 
simple random variables rests on two facts. For one thing, in practice we can distinguish only a finite set of 
possible values for any random variable. In addition, any random variable may be approximated as closely
as desired by a simple random variable. When the structure and properties of simple random variables have
been examined, we turn to more general cases. Many properties of simple random variables extend to the
general case via the approximation procedure.

Representation with the aid of indicator functions 

In order to deal with simple random variables clearly and precisely, we must find suitable ways to express 
them analytically. We do this with the aid of indicator functions (Section 1.3.4: The indicator function). 
Three basic forms of representation are encountered. These are not mutually exclusive representations.

1. Standard or canonical form, which displays the possible values and the corresponding events. If \(X\) takes on distinct values

\[\{t_1, t_2, \cdots, t_n\} \quad \text{with respective probabilities} \quad \{p_1, p_2, \cdots, p_n\} \tag{6.8}\]

and if \(A_i = \{X = t_i\}\), for \(1 \le i \le n\), then \(\{A_1, A_2, \cdots, A_n\}\) is a partition (i.e., on any trial, exactly one of these events occurs). We call this the partition determined by (or, generated by) \(X\). We may write

\[X = t_1 I_{A_1} + t_2 I_{A_2} + \cdots + t_n I_{A_n} = \sum_{i=1}^{n} t_i I_{A_i} \tag{6.9}\]

If \(X(\omega) = t_i\), then \(\omega \in A_i\), so that \(I_{A_i}(\omega) = 1\) and all the other indicator functions have value zero. The summation expression thus picks out the correct value \(t_i\). This is true for any \(t_i\), so the expression represents \(X(\omega)\) for all \(\omega\). The distinct set \(\{t_1, t_2, \cdots, t_n\}\) of the values and the corresponding probabilities \(\{p_1, p_2, \cdots, p_n\}\) constitute the distribution for \(X\). Probability calculations for \(X\) are made in terms of its distribution. One of the advantages of the canonical form is that it displays the range (set of values), and if the probabilities \(p_i = P(A_i)\) are known, the distribution is determined. Note that in canonical form, if one of the \(t_i\) has value zero, we include that term. For some probability distributions it may be that \(P(A_i) = 0\) for one or more of the \(t_i\). In that case, we call these values null values, for they can only occur with probability zero, and hence are practically impossible. In the general formulation, we include possible null values, since they do not affect any probability calculations.

Example 6.3: Successes in Bernoulli trials 

As the analysis of Bernoulli trials and the binomial distribution shows (see Section 4.8), canonical form must be

\[S_n = \sum_{k=0}^{n} k I_{A_k} \quad \text{with} \quad P(A_k) = C(n,k)\, p^k (1-p)^{n-k}, \quad 0 \le k \le n \tag{6.10}\]



For many purposes, both theoretical and practical, canonical form is desirable. For one thing, it displays directly the range (i.e., set of values) of the random variable. The distribution consists of the set of values \(\{t_k : 1 \le k \le n\}\) paired with the corresponding set of probabilities \(\{p_k : 1 \le k \le n\}\), where \(p_k = P(A_k) = P(X = t_k)\).
2. Simple random variable \(X\) may be represented by a primitive form

\[X = c_1 I_{C_1} + c_2 I_{C_2} + \cdots + c_m I_{C_m}, \quad \text{where } \{C_j : 1 \le j \le m\} \text{ is a partition} \tag{6.11}\]

Remarks 

• If \(\{C_j : 1 \le j \le m\}\) is a disjoint class, but \(\bigcup_{j=1}^{m} C_j \ne \Omega\), we may append the event \(C_{m+1} = \bigl[\bigcup_{j=1}^{m} C_j\bigr]^c\) and assign value zero to it.

• We say a primitive form, since the representation is not unique. Any of the \(C_i\) may be partitioned, with the same value \(c_i\) associated with each subset formed.

• Canonical form is a special primitive form. Canonical form is unique, and in many ways normative. 

Example 6.4: Simple random variables in primitive form

• A wheel is spun yielding, on an equally likely basis, the integers 1 through 10. Let \(C_i\) be the event the wheel stops at \(i\), \(1 \le i \le 10\). Each \(P(C_i) = 0.1\). If the numbers 1, 4, or 7 turn up, the player loses ten dollars; if the numbers 2, 5, or 8 turn up, the player gains nothing; if the numbers 3, 6, or 9 turn up, the player gains ten dollars; if the number 10 turns up, the player loses one dollar. The random variable expressing the results may be expressed in primitive form as

\[X = -10 I_{C_1} + 0 I_{C_2} + 10 I_{C_3} - 10 I_{C_4} + 0 I_{C_5} + 10 I_{C_6} - 10 I_{C_7} + 0 I_{C_8} + 10 I_{C_9} - I_{C_{10}} \tag{6.12}\]

• A store has eight items for sale. The prices are $3.50, $5.00, $3.50, $7.50, $5.00, $5.00, $3.50, and $7.50, respectively. A customer comes in. She purchases one of the items with probabilities 0.10, 0.15, 0.15, 0.20, 0.10, 0.05, 0.10, 0.15. The random variable expressing the amount of her purchase may be written

\[X = 3.5 I_{C_1} + 5.0 I_{C_2} + 3.5 I_{C_3} + 7.5 I_{C_4} + 5.0 I_{C_5} + 5.0 I_{C_6} + 3.5 I_{C_7} + 7.5 I_{C_8} \tag{6.13}\]

3. We commonly have \(X\) represented in affine form, in which the random variable is represented as an affine combination of indicator functions (i.e., a linear combination of the indicator functions plus a constant, which may be zero).

\[X = c_0 + c_1 I_{E_1} + c_2 I_{E_2} + \cdots + c_m I_{E_m} = c_0 + \sum_{j=1}^{m} c_j I_{E_j} \tag{6.14}\]

In this form, the class \(\{E_1, E_2, \cdots, E_m\}\) is not necessarily mutually exclusive, and the coefficients do not display directly the set of possible values. In fact, the \(E_i\) often form an independent class. Remark. Any primitive form is a special affine form in which \(c_0 = 0\) and the \(E_i\) form a partition.

Example 6.5 

Consider, again, the random variable \(S_n\) which counts the number of successes in a sequence of \(n\) Bernoulli trials. If \(E_i\) is the event of a success on the \(i\)th trial, then one natural way to express the count is

\[S_n = \sum_{i=1}^{n} I_{E_i}, \quad \text{with } P(E_i) = p, \quad 1 \le i \le n \tag{6.15}\]

This is affine form, with \(c_0 = 0\) and \(c_i = 1\) for \(1 \le i \le n\). In this case, the \(E_i\) cannot form a mutually exclusive class, since they form an independent class.
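That the affine and canonical representations describe the same function can be confirmed outcome by outcome on a small case. The following Python sketch assumes \(n = 3\) trials and represents each outcome as a tuple of 0s and 1s; the names `indicator`, `affine`, and `canonical` are illustrative only.

```python
from itertools import product

n = 3
outcomes = list(product((0, 1), repeat=n))   # 1 marks a success on that trial

def indicator(event):
    """Indicator function I_E: 1 on outcomes in E, 0 elsewhere."""
    return lambda w: 1 if w in event else 0

# Affine form: S_n = I_E1 + ... + I_En, with E_i = {success on trial i}.
E = [{w for w in outcomes if w[i] == 1} for i in range(n)]

def affine(w):
    return sum(indicator(E_i)(w) for E_i in E)

# Canonical form: S_n = sum_k k * I_Ak, with A_k = {S_n = k}.
A = {k: {w for w in outcomes if sum(w) == k} for k in range(n + 1)}

def canonical(w):
    return sum(k * indicator(A_k)(w) for k, A_k in A.items())

same = all(affine(w) == canonical(w) for w in outcomes)
print(same)   # True
```

The events in `E` overlap (several trials can succeed together), while the events in `A` are disjoint and exhaust the outcomes, which is the difference between affine and canonical form.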

Events generated by a simple random variable: canonical form 

We may characterize the class of all inverse images formed by a simple random variable \(X\) in terms of the partition \(\{A_i : 1 \le i \le n\}\) it determines. Consider any set \(M\) of real numbers. If \(t_i\) in the range of \(X\) is in \(M\), then every point \(\omega \in A_i\) maps into \(t_i\), hence into \(M\). If the set \(J\) is the set of indices \(i\) such that \(t_i \in M\), then

Only those points \(\omega\) in \(A_M = \bigvee_{i \in J} A_i\) map into \(M\).

Hence, the class of events (i.e., inverse images) determined by \(X\) consists of the impossible event \(\emptyset\), the sure event \(\Omega\), and the union of any subclass of the \(A_i\) in the partition determined by \(X\).

Example 6.6: Events determined by a simple random variable 

Suppose simple random variable \(X\) is represented in canonical form by

\[X = -2 I_A - I_B + 0 I_C + 3 I_D \tag{6.16}\]

Then the class \(\{A, B, C, D\}\) is the partition determined by \(X\) and the range of \(X\) is \(\{-2, -1, 0, 3\}\).

a. If \(M\) is the interval \([-2, 1]\), then the values -2, -1, and 0 are in \(M\) and \(X^{-1}(M) = A \bigvee B \bigvee C\).

b. If \(M\) is the set \((-2, -1] \cup [1, 5]\), then the values -1, 3 are in \(M\) and \(X^{-1}(M) = B \bigvee D\).

c. The event \(\{X \le 1\} = \{X \in (-\infty, 1]\} = X^{-1}(M)\), where \(M = (-\infty, 1]\). Since values -2, -1, 0 are in \(M\), the event \(\{X \le 1\} = A \bigvee B \bigvee C\).
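The three cases of Example 6.6 reduce to testing which values in the range lie in \(M\). A short Python sketch of that test follows; representing the canonical form as a dictionary from value to event name, and \(M\) as a predicate, are choices made here for illustration.

```python
# Canonical form of Example 6.6: value t_i -> name of the event A_i = {X = t_i}.
canonical = {-2: "A", -1: "B", 0: "C", 3: "D"}

def event(M):
    """Names of the partition events whose union is X^{-1}(M); M is a predicate on t."""
    return [name for t, name in sorted(canonical.items()) if M(t)]

a = event(lambda t: -2 <= t <= 1)                      # M = [-2, 1]
b = event(lambda t: (-2 < t <= -1) or (1 <= t <= 5))   # M = (-2, -1] U [1, 5]
c = event(lambda t: t <= 1)                            # {X <= 1}
print(a, b, c)   # ['A', 'B', 'C'] ['B', 'D'] ['A', 'B', 'C']
```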



6.1.5 Determination of the distribution 

Determining the partition generated by a simple random variable amounts to determining the canonical
form. The distribution is then completed by determining the probabilities of each event \(A_k = \{X = t_k\}\).

From a primitive form 

Before writing down the general pattern, we consider an illustrative example. 

Example 6.7: The distribution from a primitive form 

Suppose one item is selected at random from a group of ten items. The values (in dollars) and respective probabilities are

    c_j       2.00   1.50   2.00   2.50   1.50   1.50   1.00   2.50   2.00   1.50
    P(C_j)    0.08   0.11   0.07   0.15   0.10   0.09   0.14   0.08   0.08   0.10

Table 6.1

By inspection, we find four distinct values: \(t_1 = 1.00\), \(t_2 = 1.50\), \(t_3 = 2.00\), and \(t_4 = 2.50\). The value 1.00 is taken on for \(\omega \in C_7\), so that \(A_1 = C_7\) and \(P(A_1) = P(C_7) = 0.14\). Value 1.50 is taken on for \(\omega \in C_2, C_5, C_6, C_{10}\) so that

\[A_2 = C_2 \bigvee C_5 \bigvee C_6 \bigvee C_{10} \quad \text{and} \quad P(A_2) = P(C_2) + P(C_5) + P(C_6) + P(C_{10}) = 0.40 \tag{6.17}\]

Similarly

\[P(A_3) = P(C_1) + P(C_3) + P(C_9) = 0.23 \quad \text{and} \quad P(A_4) = P(C_4) + P(C_8) = 0.23 \tag{6.18}\]

The distribution for \(X\) is thus

    k           1.00   1.50   2.00   2.50
    P(X = k)    0.14   0.40   0.23   0.23

Table 6.2

The general procedure may be formulated as follows: 

If \(X = \sum_{j=1}^{m} c_j I_{C_j}\), we identify the set of distinct values in the set \(\{c_j : 1 \le j \le m\}\). Suppose these are \(t_1 < t_2 < \cdots < t_n\). For any possible value \(t_i\) in the range, identify the index set \(J_i\) of those \(j\) such that \(c_j = t_i\). Then the terms

\[\sum_{j \in J_i} c_j I_{C_j} = t_i \sum_{j \in J_i} I_{C_j} = t_i I_{A_i}, \quad \text{where } A_i = \bigvee_{j \in J_i} C_j, \tag{6.19}\]

and

\[P(A_i) = P(X = t_i) = \sum_{j \in J_i} P(C_j) \tag{6.20}\]

Examination of this procedure shows that there are two phases:

• Select and sort the distinct values \(t_1, t_2, \cdots, t_n\)

• Add all probabilities associated with each value \(t_i\) to determine \(P(X = t_i)\)

We use the m-function csort which performs these two operations (see Example 4 (Example 2.9: The software 
survey (continued)) from "Minterms and MATLAB Calculations"). 

Example 6.8: Use of csort on Example 6.7 (The distribution from a primitive form)

>> C = [2.00 1.50 2.00 2.50 1.50 1.50 1.00 2.50 2.00 1.50];   % Matrix of c_j
>> pc = [0.08 0.11 0.07 0.15 0.10 0.09 0.14 0.08 0.08 0.10];  % Matrix of P(C_j)
>> [X,PX] = csort(C,pc);       % The sorting and consolidating operation
>> disp([X;PX]')               % Display of results
    1.0000    0.1400
    1.5000    0.4000
    2.0000    0.2300
    2.5000    0.2300

For a problem this small, use of a tool such as csort is not really needed. But in many problems 
with large sets of data the m-function csort is very useful. 
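For readers without the m-function at hand, the two phases (sort the distinct values, add the probabilities attached to each) can be sketched in a few lines of Python. The function below is a stand-in written for this illustration; it reflects the behavior described in the text, not the actual source of csort.

```python
from collections import defaultdict

def csort(values, probs):
    """Sort the distinct values and add the probabilities attached to each one.

    A plain-Python stand-in for the text's m-function csort (an assumption
    about its behavior based on the description, not its source code).
    """
    total = defaultdict(float)
    for v, p in zip(values, probs):
        total[v] += p
    xs = sorted(total)
    return xs, [total[x] for x in xs]

c = [2.00, 1.50, 2.00, 2.50, 1.50, 1.50, 1.00, 2.50, 2.00, 1.50]
pc = [0.08, 0.11, 0.07, 0.15, 0.10, 0.09, 0.14, 0.08, 0.08, 0.10]
X, PX = csort(c, pc)
for x, px in zip(X, PX):
    print(f"{x:.2f}  {px:.4f}")
```

The printed pairs agree with Table 6.2: 1.00 with 0.1400, 1.50 with 0.4000, 2.00 with 0.2300, and 2.50 with 0.2300.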

From affine form

Suppose \(X\) is in affine form,

\[X = c_0 + c_1 I_{E_1} + c_2 I_{E_2} + \cdots + c_m I_{E_m} = c_0 + \sum_{j=1}^{m} c_j I_{E_j} \tag{6.21}\]

We determine a particular primitive form by determining the value of \(X\) on each minterm generated by the class \(\{E_j : 1 \le j \le m\}\). We do this in a systematic way by utilizing minterm vectors and properties of indicator functions.

Step 1. \(X\) is constant on each minterm generated by the class \(\{E_1, E_2, \cdots, E_m\}\) since, as noted in the treatment of the minterm expansion, each indicator function \(I_{E_i}\) is constant on each minterm. We determine the value \(s_i\) of \(X\) on each minterm \(M_i\). This describes \(X\) in a special primitive form

\[X = \sum_{i=0}^{2^m - 1} s_i I_{M_i}, \quad \text{with } P(M_i) = p_i, \quad 0 \le i \le 2^m - 1 \tag{6.22}\]

Step 2. We apply the csort operation to the matrices of values and minterm probabilities to determine the distribution for \(X\).

We illustrate with a simple example. Extension to the general case should be quite evident. First, we do the 
problem "by hand" in tabular form. Then we use the m-procedures to carry out the desired operations. 

Example 6.9: Finding the distribution from affine form

A mail order house is featuring three items (limit one of each kind per customer). Let

\(E_1\) = the event the customer orders item 1, at a price of 10 dollars.
\(E_2\) = the event the customer orders item 2, at a price of 18 dollars.
\(E_3\) = the event the customer orders item 3, at a price of 10 dollars.

There is a mailing charge of 3 dollars per order.

We suppose \(\{E_1, E_2, E_3\}\) is independent with probabilities 0.6, 0.3, 0.5, respectively. Let \(X\) be the amount a customer who orders the special items spends on them plus mailing cost. Then, in affine form,

\[X = 10 I_{E_1} + 18 I_{E_2} + 10 I_{E_3} + 3 \tag{6.23}\]

We seek first the primitive form, using the minterm probabilities, which may be calculated in this case by using the m-function minprob.

1. To obtain the value of \(X\) on each minterm we

• Multiply the minterm vector for each generating event by the coefficient for that event

• Sum the values on each minterm and add the constant

To complete the table, list the corresponding minterm probabilities.

    i    10 I_E1   18 I_E2   10 I_E3    c    s_i    pm_i
    0       0         0         0       3     3     0.14
    1       0         0        10       3    13     0.14
    2       0        18         0       3    21     0.06
    3       0        18        10       3    31     0.06
    4      10         0         0       3    13     0.21
    5      10         0        10       3    23     0.21
    6      10        18         0       3    31     0.09
    7      10        18        10       3    41     0.09

Table 6.3

We then sort on the \(s_i\), the values on the various \(M_i\), to expose more clearly the primitive form for \(X\).

"Primitive form" Values

    i    s_i    pm_i
    0     3     0.14
    1    13     0.14
    4    13     0.21
    2    21     0.06
    5    23     0.21
    3    31     0.06
    6    31     0.09
    7    41     0.09

Table 6.4



The primitive form of \(X\) is thus

\[X = 3 I_{M_0} + 13 I_{M_1} + 13 I_{M_4} + 21 I_{M_2} + 23 I_{M_5} + 31 I_{M_3} + 31 I_{M_6} + 41 I_{M_7} \tag{6.24}\]

We note that the value 13 is taken on for minterms \(M_1\) and \(M_4\). The probability \(X\) has the value 13 is thus \(p(1) + p(4)\). Similarly, \(X\) has value 31 on minterms \(M_3\) and \(M_6\).
2. To complete the process of determining the distribution, we list the sorted values and consolidate by adding together the probabilities of the minterms on which each value is taken, as follows:

    k    t_k    p_k
    1     3     0.14
    2    13     0.14 + 0.21 = 0.35
    3    21     0.06
    4    23     0.21
    5    31     0.06 + 0.09 = 0.15
    6    41     0.09

Table 6.5

The results may be put in a matrix X of possible values and a corresponding matrix PX of probabilities that X takes on each of these values. Examination of the table shows that

X = [3 13 21 23 31 41]   and   PX = [0.14 0.35 0.06 0.21 0.15 0.09]     (6.25)

Matrices X and PX describe the distribution for X.
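The hand calculation of Example 6.9 can be mirrored in a short program. The sketch below is written in Python rather than with the text's m-functions; the function name `affine_distribution` is hypothetical, and it assumes, as in the example, that the generating events are independent so that each minterm probability is a product.

```python
from collections import defaultdict
from itertools import product

def affine_distribution(coeffs, const, probs):
    """Distribution of X = const + sum_j coeffs[j] * I_{E_j} for independent E_j.

    Enumerates the 2^m minterms (each E_j occurs or not), computes the value
    of X and the probability of each minterm, then consolidates equal values.
    """
    total = defaultdict(float)
    for pattern in product((0, 1), repeat=len(coeffs)):   # minterm vector
        value = const + sum(c * e for c, e in zip(coeffs, pattern))
        prob = 1.0
        for p, e in zip(probs, pattern):
            prob *= p if e else 1 - p                      # independence
        total[value] += prob
    xs = sorted(total)
    return xs, [total[x] for x in xs]

X, PX = affine_distribution([10, 18, 10], 3, [0.6, 0.3, 0.5])
print(X)                            # [3, 13, 21, 23, 31, 41]
print([round(p, 4) for p in PX])    # [0.14, 0.35, 0.06, 0.21, 0.15, 0.09]
```

The enumeration order of `product((0, 1), repeat=3)` matches the minterm numbering of Table 6.3, so the intermediate values are 3, 13, 21, 31, 13, 23, 31, 41 before consolidation.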



6.1.6 An m-procedure for determining the distribution from affine form 

We now consider suitable MATLAB steps in determining the distribution from affine form, then incorporate 
these in the m-procedure canonic for carrying out the transformation. We start with the random variable 
in affine form, and suppose we have available, or can calculate, the minterm probabilities. 

1. The procedure uses mintable to set the basic minterm vector patterns, then uses a matrix of coefficients, 
including the constant term (set to zero if absent), to obtain the values on each minterm. The minterm 
probabilities are included in a row matrix. 

2. Having obtained the values on each minterm, the procedure performs the desired consolidation by 
using the m-function csort. 

Example 6.10: Steps in determining the distribution for X in Example 6.9 (Finding
the distribution from affine form)

>> c = [10 18 10 3];                 % Constant term is listed last
>> pm = minprob(0.1*[6 3 5]);
>> M = mintable(3)                   % Minterm vector pattern
M =
     0     0     0     0     1     1     1     1
     0     0     1     1     0     0     1     1
     0     1     0     1     0     1     0     1
% - - - - - - - - - - - - - -        % An approach mimicking "hand" calculation
>> C = colcopy(c(1:3),8)             % Coefficients in position
C =
    10    10    10    10    10    10    10    10
    18    18    18    18    18    18    18    18
    10    10    10    10    10    10    10    10
>> CM = C.*M                         % Minterm vector values
CM =
     0     0     0     0    10    10    10    10
     0     0    18    18     0     0    18    18
     0    10     0    10     0    10     0    10
>> cM = sum(CM) + c(4)               % Values on minterms
cM =
     3    13    21    31    13    23    31    41
% - - - - - - - - - - - - - -        % Practical MATLAB procedure
>> s = c(1:3)*M + c(4)
s =
     3    13    21    31    13    23    31    41
>> pm
pm =  0.14  0.14  0.06  0.06  0.21  0.21  0.09  0.09    % Extra zeros deleted
>> const = c(4)*ones(1,8);
>> disp([CM;const;s;pm]')            % Display of primitive form
     0     0     0     3     3    0.14    % MATLAB gives four decimals
     0     0    10     3    13    0.14
     0    18     0     3    21    0.06
     0    18    10     3    31    0.06
    10     0     0     3    13    0.21
    10     0    10     3    23    0.21
    10    18     0     3    31    0.09
    10    18    10     3    41    0.09
>> [X,PX] = csort(s,pm);             % Sorting on s, consolidation of pm
>> disp([X;PX]')                     % Display of final result
     3    0.14
    13    0.35
    21    0.06
    23    0.21
    31    0.15
    41    0.09



The two basic steps are combined in the m-procedure canonic, which we use to solve the previous problem. 

Example 6.11: Use of canonic on the variables of Example 6.10 (Steps in determining
the distribution for X in Example 6.9 (Finding the distribution from affine form))

>> c = [10 18 10 3];                 % Note that the constant term 3 must be included last
>> pm = minprob([0.6 0.3 0.5]);
>> canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
>> disp(XDBN)
    3.0000    0.1400
   13.0000    0.3500
   21.0000    0.0600
   23.0000    0.2100
   31.0000    0.1500
   41.0000    0.0900

With the distribution available in the matrices X (set of values) and PX (set of probabilities), we may
calculate a wide variety of quantities associated with the random variable.
We use two key devices: 

1. Use relational and logical operations on the matrix of values X to determine a matrix M which has ones for those values which meet a prescribed condition. Then \(P(X \in M)\) is obtained as PM = M*PX'.

2. Determine \(G = g(X) = [g(X_1)\ g(X_2)\ \cdots\ g(X_n)]\) by using array operations on matrix X. We have two alternatives:

a. Use the matrix G, which has values \(g(t_i)\) for each possible value \(t_i\) for \(X\), or,

b. Apply csort to the pair (G, PX) to get the distribution for \(Z = g(X)\). This distribution (in value and probability matrices) may be used in exactly the same manner as that for the original random variable \(X\).

Example 6.12: Continuation of Example 6.11 (Use of canonic on the variables of
Example 6.10 (Steps in determining the distribution for X in Example 6.9 (Finding
the distribution from affine form)))

Suppose for the random variable X in Example 6.11 (Use of canonic on the variables of Example 6.10
(Steps in determining the distribution for X in Example 6.9 (Finding the distribution from affine
form))) it is desired to determine the probabilities

\[P(15 \le X \le 35), \quad P(|X - 20| \le 7), \quad \text{and} \quad P((X - 10)(X - 25) > 0).\]

>> M = (X>=15)&(X<=35)
M = 0 0 1 1 1 0                 % Ones for values of X with 15 <= X <= 35
>> PM = M*PX'                   % Picks out and sums the corresponding probabilities
PM = 0.4200
>> N = abs(X-20)<=7
N = 0 1 1 1 0 0                 % Ones for values of X with |X - 20| <= 7
>> PN = N*PX'                   % Picks out and sums the corresponding probabilities
PN = 0.6200
>> G = (X - 10).*(X - 25)
G = 154 -36 -44 -26 126 496     % Value of g(t_i) for each possible value
>> P1 = (G>0)*PX'               % Total probability for those t_i such that
P1 = 0.3800                     % g(t_i) > 0
>> [Z,PZ] = csort(G,PX)         % Distribution for Z = g(X)
Z = -44 -36 -26 126 154 496
PZ = 0.0600 0.3500 0.2100 0.1500 0.1400 0.0900
>> P2 = (Z>0)*PZ'               % Calculation using distribution for Z
P2 = 0.3800
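The same two devices (a 0-1 selector on the values, and a function applied value by value) carry over to other languages. Here is a Python sketch of the calculations in Example 6.12; the helper names `prob` and `g` are ours, and the consolidation step plays the role that csort plays in the MATLAB session.

```python
from collections import defaultdict

# Distribution of X from Example 6.11.
X = [3, 13, 21, 23, 31, 41]
PX = [0.14, 0.35, 0.06, 0.21, 0.15, 0.09]

def prob(condition):
    """P(condition(X)): add the probabilities of the values that satisfy it."""
    return sum(p for t, p in zip(X, PX) if condition(t))

def g(t):
    return (t - 10) * (t - 25)

PM = prob(lambda t: 15 <= t <= 35)
PN = prob(lambda t: abs(t - 20) <= 7)
P1 = prob(lambda t: g(t) > 0)

# Distribution for Z = g(X): consolidate probabilities on equal values of g.
dist = defaultdict(float)
for t, p in zip(X, PX):
    dist[g(t)] += p
Z = sorted(dist)
PZ = [dist[z] for z in Z]
P2 = sum(p for z, p in zip(Z, PZ) if z > 0)

print(round(PM, 4), round(PN, 4), round(P1, 4), round(P2, 4))   # 0.42 0.62 0.38 0.38
print(Z)                                                        # [-44, -36, -26, 126, 154, 496]
```

`P1` works directly from the values \(g(t_i)\), while `P2` first forms the distribution for \(Z\); the two agree, as in the text.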

Example 6.13: Alternate formulation of Example 3 (Example 4.19) from "Composite 
Trials" 

Ten race cars are involved in time trials to determine pole positions for an upcoming race. To qualify, they must post an average speed of 125 mph or more on a trial run. Let \(E_i\) be the event the \(i\)th car makes qualifying speed. It seems reasonable to suppose the class \(\{E_i : 1 \le i \le 10\}\) is independent. If the respective probabilities for success are 0.90, 0.88, 0.93, 0.77, 0.85, 0.96, 0.72, 0.83, 0.91, 0.84, what is the probability that \(k\) or more will qualify (\(k = 6, 7, 8, 9, 10\))?

SOLUTION

Let \(X = \sum_{i=1}^{10} I_{E_i}\).

>> c = [ones(1,10) 0];
>> P = [0.90, 0.88, 0.93, 0.77, 0.85, 0.96, 0.72, 0.83, 0.91, 0.84];
>> canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  minprob(P)
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
>> k = 6:10;
>> for i = 1:length(k)
     Pk(i) = (X>=k(i))*PX';
   end
>> disp(Pk)
    0.9938    0.9628    0.8472    0.5756    0.2114

This solution is not as convenient to write out. However, with the distribution for X as defined, a great
many other probabilities can be determined. This is particularly the case when it is desired to compare the
results of two independent races or "heats." We consider such problems in the study of Independent Classes
of Random Variables (Section 9.1).
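For a count of independent events, the distribution can also be built up one event at a time instead of enumerating all \(2^{10}\) minterms. The Python sketch below uses that alternative technique (a running convolution), which is not the method of canonic; the function name `count_distribution` is hypothetical.

```python
def count_distribution(probs):
    """Distribution of the number of independent events that occur.

    dist[k] is P(exactly k events occur); events are added one at a time.
    """
    dist = [1.0]
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1 - p)     # event fails: count stays at k
            new[k + 1] += mass * p       # event occurs: count moves to k + 1
        dist = new
    return dist

P = [0.90, 0.88, 0.93, 0.77, 0.85, 0.96, 0.72, 0.83, 0.91, 0.84]
PX = count_distribution(P)
Pk = [sum(PX[k:]) for k in range(6, 11)]   # P(X >= k) for k = 6, ..., 10
print([round(v, 4) for v in Pk])
```

The printed values can be compared with the row displayed by disp(Pk) in the MATLAB session; the last entry, \(P(X \ge 10)\), is simply the product of the ten success probabilities.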

A function form for canonic 

One disadvantage of the procedure canonic is that it always names the output X and PX. While these 
can easily be renamed, frequently it is desirable to use some other name for the random variable from the 
start. A function form, which we call canonicf, is useful in this case. 

Example 6.14: Alternate solution of Example 6.13 (Alternate formulation of Example
3 (Example 4.19) from "Composite Trials"), using canonicf

>> c = [10 18 10 3];
>> pm = minprob(0.1*[6 3 5]);
>> [Z,PZ] = canonicf(c,pm);
>> disp([Z;PZ]')                     % Numbers as before, but the distribution
                                     % matrices are now named Z and PZ
    3.0000    0.1400
   13.0000    0.3500
   21.0000    0.0600
   23.0000    0.2100
   31.0000    0.1500
   41.0000    0.0900



6.1.7 General random variables 

The distribution for a simple random variable is easily visualized as point mass concentrations at the various 
values in the range, and the class of events determined by a simple random variable is described in terms 
of the partition generated by \(X\) (i.e., the class of those events of the form \(A_i = \{X = t_i\}\) for each \(t_i\) in the range). The situation is conceptually the same for the general case, but the details are more complicated. If the random variable takes on a continuum of values, then the probability mass distribution may be spread smoothly on the line. Or, the distribution may be a mixture of point mass concentrations and smooth distributions on some intervals. The class of events determined by \(X\) is the set of all inverse images \(X^{-1}(M)\) for \(M\) any member of a general class of subsets of the real line known in the mathematical literature as the Borel sets. There are technical mathematical reasons for not saying \(M\) is any subset, but the class of Borel sets is general enough to include any set likely to be encountered in applications, certainly at the level of this treatment. The Borel sets include any interval and any set that can be formed by complements, countable unions, and countable intersections of Borel sets. This is a type of class known as a sigma algebra of events. Because of the preservation of set operations by the inverse image, the class of events determined by random variable \(X\) is also a sigma algebra, and is often designated \(\sigma(X)\). There are some technical questions concerning the probability measure \(P_X\) induced by \(X\), hence the distribution. These also are settled in such a manner that there is no need for concern at this level of analysis. However, some of these questions become important in dealing with random processes and other advanced notions increasingly used in applications.
Two facts provide the freedom we need to proceed with little concern for the technical details. 

1. \(X^{-1}(M)\) is an event for every Borel set \(M\) iff for every semi-infinite interval \((-\infty, t]\) on the real line \(X^{-1}((-\infty, t])\) is an event.

2. The induced probability distribution is determined uniquely by its assignment to all intervals of the form \((-\infty, t]\).

These facts point to the importance of the distribution function introduced in the next chapter. 

Another fact, alluded to above and discussed in some detail in the next chapter, is that any general 
random variable can be approximated as closely as desired by a simple random variable. We turn in the
next chapter to a description of certain commonly encountered probability distributions and ways to describe 
them analytically. 

6.2 Problems on Random Variables and Probabilities 2 

Exercise 6.1 (Solution on p. 152.) 

The following simple random variable is in canonical form:
\(X = -3.75 I_A - 1.13 I_B + 0 I_C + 2.6 I_D\).

Express the events \(\{X \in (-4, 2]\}\), \(\{X \in (0, 3]\}\), \(\{X \in (-\infty, 1]\}\), \(\{|X - 1| > 1\}\), and \(\{X > 0\}\) in
terms of A, B, C, and D.

Exercise 6.2 (Solution on p. 152.) 

Random variable \(X\), in canonical form, is given by \(X = -2 I_A - I_B + I_C + 2 I_D + 5 I_E\).

Express the events \(\{X \in [2, 3)\}\), \(\{X \le 0\}\), \(\{X < 0\}\), \(\{|X - 2| < 3\}\), and \(\{X^2 > 4\}\), in terms of
A, B, C, D, and E.

Exercise 6.3 (Solution on p. 152.) 

The class \(\{C_j : 1 \le j \le 10\}\) is a partition. Random variable \(X\) has values {1, 3, 2, 3, 4, 2, 1, 3, 5, 2}
on \(C_1\) through \(C_{10}\), respectively. Express \(X\) in canonical form.

Exercise 6.4 (Solution on p. 152.) 

The class \(\{C_j : 1 \le j \le 10\}\) in Exercise 6.3 has respective probabilities 0.08, 0.13, 0.06, 0.09, 0.14,
0.11, 0.12, 0.07, 0.11, 0.09. Determine the distribution for \(X\).

Exercise 6.5 (Solution on p. 152.) 

A wheel is spun yielding on an equally likely basis the integers 1 through 10. Let \(C_i\) be the event the wheel stops at \(i\), \(1 \le i \le 10\). Each \(P(C_i) = 0.1\). If the numbers 1, 4, or 7 turn up, the player loses ten dollars; if the numbers 2, 5, or 8 turn up, the player gains nothing; if the numbers 3, 6, or 9 turn up, the player gains ten dollars; if the number 10 turns up, the player loses one dollar. The random variable expressing the results may be expressed in primitive form as

\[X = -10 I_{C_1} + 0 I_{C_2} + 10 I_{C_3} - 10 I_{C_4} + 0 I_{C_5} + 10 I_{C_6} - 10 I_{C_7} + 0 I_{C_8} + 10 I_{C_9} - I_{C_{10}} \tag{6.26}\]

Determine the distribution for \(X\), (a) by hand, (b) using MATLAB.
Determine \(P(X < 0)\), \(P(X > 0)\).



Exercise 6.6 (Solution on p. 153.)

A store has eight items for sale. The prices are $3.50, $5.00, $3.50, $7.50, $5.00, $5.00, $3.50, and $7.50, respectively. A customer comes in. She purchases one of the items with probabilities 0.10, 0.15, 0.15, 0.20, 0.10, 0.05, 0.10, 0.15. The random variable expressing the amount of her purchase may be written

\[X = 3.5 I_{C_1} + 5.0 I_{C_2} + 3.5 I_{C_3} + 7.5 I_{C_4} + 5.0 I_{C_5} + 5.0 I_{C_6} + 3.5 I_{C_7} + 7.5 I_{C_8} \tag{6.27}\]

Determine the distribution for \(X\) (a) by hand, (b) using MATLAB.

2 This content is available online at <http://cnx.org/content/m24208/1.5/>.

Exercise 6.7 (Solution on p. 153.) 

Suppose \(X\), \(Y\) in canonical form are

\[X = 2 I_{A_1} + 3 I_{A_2} + 5 I_{A_3} \qquad Y = I_{B_1} + 2 I_{B_2} + 3 I_{B_3} \tag{6.28}\]

The \(P(A_i)\) are 0.3, 0.6, 0.1, respectively, and the \(P(B_j)\) are 0.2, 0.6, 0.2. Each pair \(\{A_i, B_j\}\) is independent. Consider the random variable \(Z = X + Y\). Then \(Z = 2 + 1\) on \(A_1 B_1\), \(Z = 3 + 3\) on \(A_2 B_3\), etc. Determine the value of \(Z\) on each \(A_i B_j\) and determine the corresponding \(P(A_i B_j)\). From this, determine the distribution for \(Z\).

Exercise 6.8 (Solution on p. 153.) 

For the random variables in Exercise 6.7, let \(W = XY\). Determine the value of \(W\) on each \(A_i B_j\)
and determine the distribution of \(W\).

Exercise 6.9 (Solution on p. 154.) 

A pair of dice is rolled. 

a. Let X be the minimum of the two numbers which turn up. Determine the distribution for X 

b. Let Y be the maximum of the two numbers. Determine the distribution for Y. 

c. Let Z be the sum of the two numbers. Determine the distribution for Z. 

d. Let W be the absolute value of the difference. Determine its distribution. 

Exercise 6.10 (Solution on p. 154.) 

Minterm probabilities \(p(0)\) through \(p(15)\) for the class \(\{A, B, C, D\}\) are, in order,

0.072 0.048 0.018 0.012 0.168 0.112 0.042 0.028 0.062 0.048 0.028 0.010 0.170 0.110 0.040 0.032     (6.29)

Determine the distribution for random variable

\[X = -5.3 I_A - 2.5 I_B + 2.3 I_C + 4.2 I_D - 3.7 \tag{6.30}\]

Exercise 6.11 (Solution on p. 155.) 

On a Tuesday evening, the Houston Rockets, the Orlando Magic, and the Chicago Bulls all have 
games (but not with one another). Let A be the event the Rockets win, B be the event the Magic 
win, and C be the event the Bulls win. Suppose the class {A, B, C} is independent, with respective 
probabilities 0.75, 0.70, 0.8. Ellen's boyfriend is a rabid Rockets fan, who does not like the Magic.
He wants to bet on the games. She decides to take him up on his bets as follows:

• $10 to 5 on the Rockets, i.e., she loses five if the Rockets win and gains ten if they lose

• $10 to 5 against the Magic

• even $5 to 5 on the Bulls.

Ellen's winning may be expressed as the random variable

\[X = -5 I_A + 10 I_{A^c} + 10 I_B - 5 I_{B^c} - 5 I_C + 5 I_{C^c} = -15 I_A + 15 I_B - 10 I_C + 10 \tag{6.31}\]

Determine the distribution for \(X\). What are the probabilities Ellen loses money, breaks even, or
comes out ahead?

Exercise 6.12 (Solution on p. 155.)

The class {A, B, C, D} has minterm probabilities 

pm = 0.001 * [5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302] (6.32) 



• Determine whether or not the class is independent. 

• The random variable \(X = I_A + I_B + I_C + I_D\) counts the number of the events which occur
on a trial. Find the distribution for \(X\) and determine the probability that two or more occur
on a trial. Find the probability that one or three of these occur on a trial.

Exercise 6.13 (Solution on p. 156.) 

James is expecting three checks in the mail, for $20, $26, and $33. Their arrivals are the
events A, B, C. Assume the class is independent, with respective probabilities 0.90, 0.75, 0.80.
Then

\[X = 20 I_A + 26 I_B + 33 I_C \tag{6.33}\]

represents the total amount received. Determine the distribution for X. What is the probability 
he receives at least $50? Less than $30? 

Exercise 6.14 (Solution on p. 156.) 

A gambler places three bets. He puts down two dollars for each bet. He picks up three dollars (his 
original bet plus one dollar) if he wins the first bet, four dollars if he wins the second bet, and six 
dollars if he wins the third. His net winning can be represented by the random variable 

\[X = 3 I_A + 4 I_B + 6 I_C - 6, \quad \text{with } P(A) = 0.5, \quad P(B) = 0.4, \quad P(C) = 0.3 \tag{6.34}\]

Assume the results of the games are independent. Determine the distribution for X. 

Exercise 6.15 (Solution on p. 157.) 

Henry goes to a hardware store. He considers a power drill at $35, a socket wrench set at $56, 
a set of screwdrivers at $18, a vise at $24, and hammer at $8. He decides independently on the 
purchases of the individual items, with respective probabilities 0.5, 0.6, 0.7, 0.4, 0.9. Let X be the 
amount of his total purchases. Determine the distribution for X. 

Exercise 6.16 (Solution on p. 158.) 

A sequence of trials (not necessarily independent) is performed. Let \( E_i \) be the event of success on 
the ith component trial. We associate with each trial a "payoff function" \( X_i = aI_{E_i} + bI_{E_i^c} \). Thus, 
an amount a is earned if there is a success on the trial and an amount b (usually negative) if there 
is a failure. Let \( S_n \) be the number of successes in the n trials and W be the net payoff. Show that 
\( W = (a - b) S_n + bn \). 

Exercise 6.17 (Solution on p. 158.) 

A marker is placed at a reference position on a line (taken to be the origin); a coin is tossed 
repeatedly. If a head turns up, the marker is moved one unit to the right; if a tail turns up, the 
marker is moved one unit to the left. 

a. Show that the position at the end of ten tosses is given by the random variable 

\[ X = \sum_{i=1}^{10} I_{E_i} - \sum_{i=1}^{10} I_{E_i^c} = 2\sum_{i=1}^{10} I_{E_i} - 10 = 2S_{10} - 10 \qquad (6.35) \]

where \( E_i \) is the event of a head on the ith toss and \( S_{10} \) is the number of heads in ten trials. 

b. After ten tosses, what are the possible positions and the probabilities of being in each? 

Exercise 6.18 (Solution on p. 158.) 

Margaret considers five purchases in the amounts 5, 17, 21, 8, 15 dollars with respective probabilities 
0.37, 0.22, 0.38, 0.81, 0.63. Anne contemplates six purchases in the amounts 8, 15, 12, 18, 15, 12 
dollars, with respective probabilities 0.77, 0.52, 0.23, 0.41, 0.83, 0.58. Assume that all eleven 
possible purchases form an independent class. 

a. Determine the distribution for X, the amount purchased by Margaret. 

b. Determine the distribution for Y, the amount purchased by Anne. 

c. Determine the distribution for Z = X + Y, the total amount the two purchase. 

Suggestion for part (c). Let MATLAB perform the calculations. 

[r,s] = ndgrid(X,Y); 
[t,u] = ndgrid(PX,PY); 
z = r + s; 
pz = t.*u; 
[Z,PZ] = csort(z,pz); 

Solutions to Exercises in Chapter 6 

Solution to Exercise 6.1 (p. 148) 

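• \( A \bigvee B \bigvee C \) 

• D 

• \( A \bigvee B \bigvee C \) 

• C 

• \( C \bigvee D \) 

Solution to Exercise 6.2 (p. 148) 

• D 

• \( A \bigvee B \) 

• \( A \bigvee B \) 

• \( B \bigvee C \bigvee D \bigvee E \) 

• \( A \bigvee D \bigvee E \) 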

Solution to Exercise 6.3 (p. 148) 

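T = [1 3 2 3 4 2 1 3 5 2]; 
[X,I] = sort(T) 
X = 1 1 2 2 2 3 3 3 4 5 
I = 1 7 3 6 10 2 4 8 5 9 

\[ X = I_A + 2I_B + 3I_C + 4I_D + 5I_E \qquad (6.36) \]

\[ A = C_1 \bigvee C_7, \quad B = C_3 \bigvee C_6 \bigvee C_{10}, \quad C = C_2 \bigvee C_4 \bigvee C_8, \quad D = C_5, \quad E = C_9 \qquad (6.37) \]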

Solution to Exercise 6.4 (p. 148) 

T = [1 3 2 3 4 2 1 3 5 2]; 
pc = 0.01*[8 13 6 9 14 11 12 7 11 9]; 
[X,PX] = csort(T,pc); 
disp([X;PX]') 
    1.0000    0.2000 
    2.0000    0.2600 
    3.0000    0.2900 
    4.0000    0.1400 
    5.0000    0.1100 
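
For readers who do not have the m-function csort at hand, the consolidation used in Solution 6.4 can be imitated with base MATLAB. The following is only a sketch, assuming (as the display above suggests) that csort sorts the values, merges repeated values, and adds the corresponding probabilities. The names Xs, k, and PXs are ours and do not come from the text.

```matlab
% Sketch: consolidate repeated values with base MATLAB functions only.
T  = [1 3 2 3 4 2 1 3 5 2];              % values on the ten minterms
pc = 0.01*[8 13 6 9 14 11 12 7 11 9];    % corresponding probabilities
[Xs,~,k] = unique(T);                    % Xs: sorted distinct values; k: position of each T(i) in Xs
PXs = accumarray(k(:), pc(:))';          % add the probabilities that share a value
disp([Xs; PXs]')                         % compare with the csort display above
```

The call to unique supplies both the sorted distinct values and the grouping index, so accumarray only has to add probabilities within each group.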



Solution to Exercise 6.5 (p. 148) 

p = 0.1*ones(1,10); 
c = [-10 0 10 -10 0 10 -10 0 10 -1]; 
[X,PX] = csort(c,p); 
disp([X;PX]') 
  -10.0000    0.3000 
   -1.0000    0.1000 
         0    0.3000 
   10.0000    0.3000 
Pneg = (X<0)*PX' 
Pneg = 0.4000 
Ppos = (X>0)*PX' 
Ppos = 0.3000 

Solution to Exercise 6.6 (p. 148) 



p = 0.01*[10 15 15 20 10 5 10 15]; 
c = [3.5 5 3.5 7.5 5 5 3.5 7.5]; 
[X,PX] = csort(c,p) ; 
disp([X;PX]') 

3.5000 0.3500 

5.0000 0.3000 

7.5000 0.3500 

Solution to Exercise 6.7 (p. 149) 



A = [2 3 5]; 
B = [1 2 3]; 
a = rowcopy(A,3); 
b = colcopy(B,3); 
Z = a + b                    % Possible values of sum Z = X + Y 
Z =  3     4     6 
     4     5     7 
     5     6     8 
PA = [0.3 0.6 0.1]; 
PB = [0.2 0.6 0.2]; 
pa = rowcopy(PA,3); 
pb = colcopy(PB,3); 
P = pa.*pb                   % Probabilities for various values 
P =  0.0600    0.1200    0.0200 
     0.1800    0.3600    0.0600 
     0.0600    0.1200    0.0200 
[Z,PZ] = csort(Z,P); 
disp([Z;PZ]')                % Distribution for Z = X + Y 
    3.0000    0.0600 
    4.0000    0.3000 
    5.0000    0.4200 
    6.0000    0.1400 
    7.0000    0.0600 
    8.0000    0.0200 

Solution to Exercise 6.8 (p. 149) 



XY = a.*b                    % XY values 
XY =  2     3     5 
      4     6    10 
      6     9    15 
[W,PW] = csort(XY,P);        % Distribution for W = XY 
disp([W;PW]') 
    2.0000    0.0600 
    3.0000    0.1200 
    4.0000    0.1800 
    5.0000    0.0200 
    6.0000    0.4200 
    9.0000    0.1200 
   10.0000    0.0600 
   15.0000    0.0200 

Solution to Exercise 6.9 (p. 149) 





t = 1:6; 
c = ones(6,6); 
[x,y] = meshgrid(t,t) 
x =  1     2     3     4     5     6      % x-values in each position 
     1     2     3     4     5     6 
     1     2     3     4     5     6 
     1     2     3     4     5     6 
     1     2     3     4     5     6 
     1     2     3     4     5     6 
y =  1     1     1     1     1     1      % y-values in each position 
     2     2     2     2     2     2 
     3     3     3     3     3     3 
     4     4     4     4     4     4 
     5     5     5     5     5     5 
     6     6     6     6     6     6 
m = min(x,y);                % min in each position 
M = max(x,y);                % max in each position 
s = x + y;                   % sum x+y in each position 
d = abs(x - y);              % |x - y| in each position 
[X,fX] = csort(m,c)          % sorts values and counts occurrences 
X =   1     2     3     4     5     6 
fX = 11     9     7     5     3     1     % PX = fX/36 
[Y,fY] = csort(M,c) 
Y =   1     2     3     4     5     6 
fY =  1     3     5     7     9    11     % PY = fY/36 
[Z,fZ] = csort(s,c) 
Z =   2     3     4     5     6     7     8     9    10    11    12 
fZ =  1     2     3     4     5     6     5     4     3     2     1     % PZ = fZ/36 
[W,fW] = csort(d,c) 
W =   0     1     2     3     4     5 
fW =  6    10     8     6     4     2     % PW = fW/36 



Solution to Exercise 6.10 (p. 149) 



% file npr06_10.m (Section 17.8.27: npr06_10) 
% Data for Exercise 6.10 
pm = [0.072 0.048 0.018 0.012 0.168 0.112 0.042 0.028 ... 
      0.062 0.048 0.028 0.010 0.170 0.110 0.040 0.032]; 
c = [-5.3 -2.5 2.3 4.2 -3.7]; 
disp('Minterm probabilities are in pm, coefficients in c') 

npr06_10 (Section 17.8.27: npr06_10) 
Minterm probabilities are in pm, coefficients in c 
canonic 
Enter row vector of coefficients c 
Enter row vector of minterm probabilities pm 
Use row matrices X and PX for calculations 
Call for XDBN to view the distribution 
XDBN 
XDBN = 
  -11.5000    0.1700 
   -9.2000    0.0400 
   -9.0000    0.0620 
   -7.3000    0.1100 
   -6.7000    0.0280 
   -6.2000    0.1680 
   -5.0000    0.0320 
   -4.8000    0.0480 
   -3.9000    0.0420 
   -3.7000    0.0720 
   -2.5000    0.0100 
   -2.0000    0.1120 
   -1.4000    0.0180 
    0.3000    0.0280 
    0.5000    0.0480 
    2.8000    0.0120 


Solution to Exercise 6.11 (p. 149) 

P = 0.01*[75 70 80] ; 
c = [-15 15 -10 10]; 
canonic 
Enter row vector of coefficients c 

Enter row vector of minterm probabilities minprob(P) 
Use row matrices X and PX for calculations 
Call for XDBN to view the distribution 
disp(XDBN) 

-15.0000 0.1800 

-5.0000 0.0450 

0 0.4800 

10.0000 0.1200 

15.0000 0.1400 

25.0000 0.0350 

PXneg = (X<0)*PX' 
PXneg = 0.2250 
PX0 = (X==0)*PX' 
PX0 = 0.4800 

PXpos = (X>0)*PX' 
PXpos = 0.2950 

Solution to Exercise 6.12 (p. 150) 



npr06_12 (Section 17.8.28: npr06_12) 
Minterm probabilities in pm, coefficients in c 
a = imintest(pm) 
The class is NOT independent 
Minterms for which the product rule fails 
a = 
     1     1     1     1 
     1     1     1     1 
     1     1     1     1 
     1     1     1     1 
canonic 
Enter row vector of coefficients c 
Enter row vector of minterm probabilities pm 
Use row matrices X and PX for calculations 
Call for XDBN to view the distribution 
XDBN = 
         0    0.0050 
    1.0000    0.0430 
    2.0000    0.2120 
    3.0000    0.4380 
    4.0000    0.3020 
P2 = (X>=2)*PX' 
P2 = 0.9520 
P13 = ((X==1)|(X==3))*PX' 
P13 = 0.4810 

Solution to Exercise 6.13 (p. 150) 

c = [20 26 33 0] ; 
P = 0.01*[90 75 80]; 
canonic 
Enter row vector of coefficients c 

Enter row vector of minterm probabilities minprob(P) 
Use row matrices X and PX for calculations 
Call for XDBN to view the distribution 
disp(XDBN) 

0 0.0050 

20.0000 0.0450 

26.0000 0.0150 

33.0000 0.0200 

46.0000 0.1350 

53.0000 0.1800 

59.0000 0.0600 

79.0000 0.5400 

P50 = (X>=50)*PX' 
P50 = 0.7800 
P30 = (X <30)*PX' 
P30 = 0.0650 

Solution to Exercise 6.14 (p. 150) 



c = [3 4 6 -6]; 
P = 0.1*[5 4 3]; 
canonic 
Enter row vector of coefficients c 
Enter row vector of minterm probabilities minprob(P) 
Use row matrices X and PX for calculations 
Call for XDBN to view the distribution 
disp(XDBN) 
   -6.0000    0.2100 
   -3.0000    0.2100 
   -2.0000    0.1400 
         0    0.0900 
    1.0000    0.1400 
    3.0000    0.0900 
    4.0000    0.0600 
    7.0000    0.0600 

Solution to Exercise 6.15 (p. 150) 



c = [35 56 18 24 8 0] ; 
P = 0.1* [5 6 7 4 9]; 
canonic 
Enter row vector of coefficients c 

Enter row vector of minterm probabilities minprob(P) 
Use row matrices X and PX for calculations 
Call for XDBN to view the distribution 
disp(XDBN) 

0 0.0036 
8.0000 0.0324 

18.0000 0.0084 

24.0000 0.0024 

26.0000 0.0756 

32.0000 0.0216 

35.0000 0.0036 

42.0000 0.0056 

43.0000 0.0324 

50.0000 0.0504 

53.0000 0.0084 

56.0000 0.0054 

59.0000 0.0024 

61.0000 0.0756 

64.0000 0.0486 

67.0000 0.0216 

74.0000 0.0126 

77.0000 0.0056 

80.0000 0.0036 

82.0000 0.1134 

85.0000 0.0504 

88.0000 0.0324 

91.0000 0.0054 

98.0000 0.0084 

99.0000 0.0486 

106.0000 0.0756 

109.0000 0.0126 

115.0000 0.0036 

117.0000 0.1134 

123.0000 0.0324 

133.0000 0.0084 

141.0000 0.0756 

Solution to Exercise 6.16 (p. 150) 

\[ X_i = aI_{E_i} + b\left(1 - I_{E_i}\right) = (a - b) I_{E_i} + b \qquad (6.38) \]

\[ W = \sum_{i=1}^{n} X_i = (a - b) \sum_{i=1}^{n} I_{E_i} + bn = (a - b) S_n + bn \qquad (6.39) \]

Solution to Exercise 6.17 (p. 150) 

\[ X_i = I_{E_i} - I_{E_i^c} = I_{E_i} - \left(1 - I_{E_i}\right) = 2I_{E_i} - 1 \qquad (6.40) \]

\[ X = \sum_{i=1}^{10} X_i = 2\sum_{i=1}^{10} I_{E_i} - 10 \qquad (6.41) \]

S = 0:10; 
PS = ibinom(10,0.5,0:10); 
X = 2*S - 10; 
disp([X;PS]') 
  -10.0000    0.0010 
   -8.0000    0.0098 
   -6.0000    0.0439 
   -4.0000    0.1172 
   -2.0000    0.2051 
         0    0.2461 
    2.0000    0.2051 
    4.0000    0.1172 
    6.0000    0.0439 
    8.0000    0.0098 
   10.0000    0.0010 
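
The m-function ibinom belongs to the author's package. As a cross-check that needs only base MATLAB, the same probabilities follow from the binomial formula \( P(S_{10} = k) = C(10, k)\, 2^{-10} \). The lines below are a sketch under that formula; the variable names are ours.

```matlab
% Sketch: position after ten tosses of a fair coin, base MATLAB only.
n  = 10;
S  = 0:n;                                       % possible numbers of heads
PS = arrayfun(@(k) nchoosek(n,k), S) * 0.5^n;   % P(S = k) = C(n,k)(1/2)^n
X  = 2*S - n;                                   % position X = 2*S - 10
disp([X; PS]')                                  % compare with the table above
```

Only even positions from -10 to 10 are possible, and the distribution is symmetric about zero because the coin is fair.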



Solution to Exercise 6.18 (p. 151) 



% file npr06_18.m (Section 17.8.29: npr06_18.m) 
cx = [5 17 21 8 15 0]; 
cy = [8 15 12 18 15 12 0]; 
pmx = minprob(0.01*[37 22 38 81 63]); 
pmy = minprob(0.01*[77 52 23 41 83 58]); 

npr06_18 (Section 17.8.29: npr06_18.m) 
[X,PX] = canonicf(cx,pmx); [Y,PY] = canonicf(cy,pmy); 
[r,s] = ndgrid(X,Y); [t,u] = ndgrid(PX,PY); 
z = r + s; pz = t.*u; 
[Z,PZ] = csort(z,pz); 
a = length(Z) 
a = 125                      % 125 different values 
plot(Z,cumsum(PZ))           % See figure. Plotting details omitted 

Chapter 7 

Distribution and Density Functions 



7.1 Distribution and Density Functions 1 

7.1.1 Introduction 

In the unit on Random Variables and Probability (Section 6.1) we introduce real random variables as 
mappings from the basic space \( \Omega \) to the real line. The mapping induces a transfer of the probability mass on 
the basic space to subsets of the real line in such a way that the probability that X takes a value in a set 
M is exactly the mass assigned to that set by the transfer. To perform probability calculations, we need to 
describe analytically the distribution on the line. For simple random variables this is easy. We have at each 
possible value of X a point mass equal to the probability X takes that value. For more general cases, we 
need a more useful description than that provided by the induced probability measure \( P_X \). 

7.1.2 The distribution function 

In the theoretical discussion on Random Variables and Probability (Section 6.1), we note that the probability 
distribution induced by a random variable X is determined uniquely by a consistent assignment of mass to 
semi-infinite intervals of the form \( (-\infty, t] \) for each real t. This suggests that a natural description is provided 
by the following. 

Definition 

The distribution function \( F_X \) for random variable X is given by 

\[ F_X(t) = P(X \le t) = P\left(X \in (-\infty, t]\right) \quad \forall\, t \in \mathbf{R} \qquad (7.1) \]

In terms of the mass distribution on the line, this is the probability mass at or to the left of the point t. As 
a consequence, \( F_X \) has the following properties: 

(F1) : \( F_X \) must be a nondecreasing function, for if t > s there must be at least as much probability mass 
at or to the left of t as there is for s. 

(F2) : \( F_X \) is continuous from the right, with a jump in the amount \( p_0 \) at \( t_0 \) iff \( P(X = t_0) = p_0 \). If the point 
t approaches \( t_0 \) from the left, the interval does not include the probability mass at \( t_0 \) until t reaches 
that value, at which point the amount at or to the left of t increases ("jumps") by amount \( p_0 \); on the 
other hand, if t approaches \( t_0 \) from the right, the interval includes the mass \( p_0 \) all the way to and 
including \( t_0 \), but drops immediately as t moves to the left of \( t_0 \). 

(F3) : Except in very unusual cases involving random variables which may take "infinite" values, the 
probability mass included in \( (-\infty, t] \) must increase to one as t moves to the right; as t moves to the left, 
the probability mass included must decrease to zero, so that 

\[ F_X(-\infty) = \lim_{t \to -\infty} F_X(t) = 0 \quad \text{and} \quad F_X(\infty) = \lim_{t \to \infty} F_X(t) = 1 \qquad (7.2) \]

A distribution function determines the probability mass in each semi-infinite interval \( (-\infty, t] \). According to 
the discussion referred to above, this determines uniquely the induced distribution. 

1 This content is available online at <http://cnx.org/content/m23267/1.6/>. 

The distribution function \( F_X \) for a simple random variable is easily visualized. The distribution consists 
of point mass \( p_i \) at each point \( t_i \) in the range. To the left of the smallest value in the range, \( F_X(t) = 0 \); as t 
increases to the smallest value \( t_1 \), \( F_X(t) \) remains constant at zero until it jumps by the amount \( p_1 \). \( F_X(t) \) 
remains constant at \( p_1 \) until t increases to \( t_2 \), where it jumps by an amount \( p_2 \) to the value \( p_1 + p_2 \). This 
continues until the value of \( F_X(t) \) reaches 1 at the largest value \( t_n \). The graph of \( F_X \) is thus a step function, 
continuous from the right, with a jump in the amount \( p_i \) at the corresponding point \( t_i \) in the range. A 
similar situation exists for a discrete-valued random variable which may take on an infinity of values (e.g., 
the geometric distribution or the Poisson distribution considered below). In this case, there is always some 
probability at points to the right of any \( t_i \), but this must become vanishingly small as t increases, since the 
total probability mass is one. 

The procedure ddbn may be used to plot the distribution function for a simple random variable from a 
matrix X of values and a corresponding matrix PX of probabilities. 

Example 7.1: Graph of F x for a simple random variable 

>> c = [10 18 10 3];              % Distribution for X in Example 6.5.1 
>> pm = minprob(0.1*[6 3 5]); 
>> canonic 
Enter row vector of coefficients c 
Enter row vector of minterm probabilities pm 
Use row matrices X and PX for calculations 
Call for XDBN to view the distribution 
>> ddbn                           % Circles show values at jumps 
Enter row matrix of VALUES X 
Enter row matrix of PROBABILITIES PX 
% Printing details   See Figure 7.1 

Figure 7.1: Distribution function for Example 7.1 (Graph of \( F_X \) for a simple random variable) 
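
The step-function description of \( F_X \) can also be checked numerically without ddbn. The sketch below evaluates \( F_X(t) = P(X \le t) \) directly from a row of values and a row of probabilities. For data it uses the distribution obtained in the solution to Exercise 6.4; the handle name FX and the test points are our own choices.

```matlab
% Sketch: distribution function of a simple random variable.
X  = [1 2 3 4 5];                    % values (from the solution to Exercise 6.4)
PX = [0.20 0.26 0.29 0.14 0.11];     % corresponding probabilities
FX = @(t) sum(PX(X <= t));           % mass at or to the left of a scalar t
t  = [0 1 2.5 3 5 7];                % points to the left of, at, and between the jumps
F  = arrayfun(FX, t);                % F is 0 before the first value and 1 from the last value on
disp([t; F]')
```

Because the comparison is X <= t, the value at a jump point already includes the mass there, which is the right-continuity stated in (F2).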



7.1.3 Description of some common discrete distributions 

We make repeated use of a number of common distributions which are used in many practical situations. 
This collection includes several distributions which are studied in the chapter "Random Variables and Prob- 
abilities" (Section 6.1). 

1. Indicator function. \( X = I_E \), with \( P(X = 1) = P(E) = p \) and \( P(X = 0) = q = 1 - p \). The distribution 
function has a jump in the amount q at t = 0 and an additional jump of p to the value 1 at t = 1. 

2. Simple random variable \( X = \sum_{i=1}^{n} t_i I_{A_i} \) (canonical form) 

\[ P(X = t_i) = P(A_i) = p_i \qquad (7.3) \]

The distribution function is a step function, continuous from the right, with jump of \( p_i \) at \( t = t_i \) (see 
Figure 7.1 for Example 7.1 (Graph of \( F_X \) for a simple random variable)). 

3. Binomial (n, p). This random variable appears as the number of successes in a sequence of n Bernoulli 
trials with probability p of success. In its simplest form 

\[ X = \sum_{i=1}^{n} I_{E_i} \quad \text{with } \{E_i : 1 \le i \le n\} \text{ independent} \qquad (7.4) \]

\[ P(E_i) = p, \qquad P(X = k) = C(n, k)\, p^k q^{n-k} \qquad (7.5) \]



As pointed out in the study of Bernoulli sequences in the unit on Composite Trials, two m-functions 
ibinom and cbinom are available for computing the individual and cumulative binomial probabilities. 

4. Geometric (p). There are two related distributions, both arising in the study of continuing Bernoulli 
sequences. The first counts the number of failures before the first success. This is sometimes called 
the "waiting time." The event {X = k} consists of a sequence of k failures, then a success. Thus 

\[ P(X = k) = q^k p, \quad 0 \le k \qquad (7.6) \]

The second designates the component trial on which the first success occurs. The event {Y = k} 
consists of k - 1 failures, then a success on the kth component trial. We have 

\[ P(Y = k) = q^{k-1} p, \quad 1 \le k \qquad (7.7) \]

We say X has the geometric distribution with parameter (p), which we often designate by X ~ 
geometric (p). Now Y = X + 1 or Y - 1 = X. For this reason, it is customary to refer to the 
distribution for the number of the trial for the first success by saying Y - 1 ~ geometric (p). The 
probability of k or more failures before the first success is \( P(X \ge k) = q^k \). Also 

\[ P(X \ge n + k \mid X \ge n) = \frac{P(X \ge n + k)}{P(X \ge n)} = \frac{q^{n+k}}{q^n} = q^k = P(X \ge k) \qquad (7.8) \]

This suggests that a Bernoulli sequence essentially "starts over" on each trial. If it has failed n times, 
the probability of failing an additional k or more times before the next success is the same as the initial 
probability of failing k or more times before the first success. 

Example 7.2: The geometric distribution 

A statistician is taking a random sample from a population in which two percent of the 
members own a BMW automobile. She takes a sample of size 100. What is the probability 
of finding no BMW owners in the sample? 

SOLUTION 

The sampling process may be viewed as a sequence of Bernoulli trials with probability p = 0.02 
of success. The probability of 100 or more failures before the first success is \( 0.98^{100} = 0.1326 \), 
or about 1/7.5. 
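
The tail formula and the "starting over" property of the geometric distribution can be checked numerically. The following is a sketch in base MATLAB, not part of the author's m-programs. The choices n = 30 and k = 20 are arbitrary illustrations, and the support is cut off at 5000, beyond which the remaining mass is negligible for p = 0.02.

```matlab
% Sketch: geometric tail probabilities and the "starting over" property.
p = 0.02;  q = 1 - p;
P100 = q^100;                              % P(X >= 100), as in Example 7.2
n = 30;  k = 20;                           % arbitrary illustrative choices
vals = 0:5000;                             % truncated support
pmf  = q.^vals * p;                        % P(X = j) = q^j p
tail = @(m) sum(pmf(vals >= m));           % P(X >= m) by direct summation
lhs  = tail(n + k) / tail(n);              % P(X >= n+k | X >= n)
rhs  = q^k;                                % P(X >= k)
fprintf('P100 = %.4f, conditional = %.6f, q^k = %.6f\n', P100, lhs, rhs)
```

The conditional probability is computed from the probabilities P(X = j) alone, so its agreement with q^k is a check on (7.8) rather than a restatement of it.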

5. Negative binomial (m, p). X is the number of failures before the mth success. It is generally more 
convenient to work with Y = X + m, the number of the trial on which the mth success occurs. An 
examination of the possible patterns and elementary combinatorics show that 

\[ P(Y = k) = C(k - 1, m - 1)\, p^m q^{k-m}, \quad m \le k \qquad (7.9) \]

There are m - 1 successes in the first k - 1 trials, then a success. Each combination has probability 
\( p^m q^{k-m} \). We have an m-function nbinom to calculate these probabilities. 

Example 7.3: A game of chance 

A player throws a single six-sided die repeatedly. He scores if he throws a 1 or a 6. What is 
the probability he scores five times in ten or fewer throws? 

>> p = sum(nbinom(5,1/3,5:10)) 
p = 0.2131 

An alternate solution is possible with the use of the binomial distribution. The mth success 
comes not later than the kth trial iff the number of successes in k trials is greater than or 
equal to m. 

>> p = cbinom(10,1/3,5) 
p = 0.2131 

6. Poisson (\( \mu \)). This distribution is assumed in a wide variety of applications. It appears as a counting 
variable for items arriving with exponential interarrival times (see the relationship to the gamma 
distribution below). For large n and small p (which may not be a value found in a table), the binomial 
distribution is approximately Poisson (np). Use of the generating function (see Transform Methods) 
shows the sum of independent Poisson random variables is Poisson. The Poisson distribution is integer 
valued, with 

\[ P(X = k) = e^{-\mu} \frac{\mu^k}{k!}, \quad 0 \le k \qquad (7.10) \]

Although Poisson probabilities are usually easier to calculate with scientific calculators than binomial 
probabilities, the use of tables is often quite helpful. As in the case of the binomial distribution, we have 
two m-functions for calculating Poisson probabilities. These have advantages of speed and parameter 
range similar to those for ibinom and cbinom. 

• \( P(X = k) \) is calculated by P = ipoisson(mu,k), where k is a row or column vector of integers and 
the result P is a row matrix of the probabilities. 

• \( P(X \ge k) \) is calculated by P = cpoisson(mu,k), where k is a row or column vector of integers and 
the result P is a row matrix of the probabilities. 

Example 7.4: Poisson counting random variable 

The number of messages arriving in a one minute period at a communications network 
junction is a random variable N ~ Poisson (130). What is the probability the number of 
arrivals is greater than or equal to 110, 120, 130, 140, 150, 160? 

>> p = cpoisson(130,110:10:160) 
p = 0.9666 0.8209 0.5117 0.2011 0.0461 0.0060 

The descriptions of these distributions, along with a number of other facts, are summarized 
in the table DATA ON SOME COMMON DISTRIBUTIONS in Appendix C (Section 17.3). 
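
Examples 7.3 and 7.4 can be cross-checked with base MATLAB functions only. The sketch below does not reproduce nbinom, cbinom, or cpoisson; it applies formula (7.9) and the binomial formula directly, and for the Poisson tail it relies on the standard identity that \( P(N \ge k) \) for N ~ Poisson (\( \mu \)) equals the regularized lower incomplete gamma function, which MATLAB provides as gammainc(mu,k). All variable names are ours.

```matlab
% Sketch: Examples 7.3 and 7.4 with base MATLAB functions only.
% Negative binomial: P(Y = k) = C(k-1,m-1) p^m q^(k-m), summed over k = 5..10.
m = 5;  p = 1/3;  q = 1 - p;
k = m:10;
PY   = arrayfun(@(r) nchoosek(r-1, m-1), k) .* p^m .* q.^(k - m);
p_nb = sum(PY);                            % P(fifth success by the tenth throw)
% Equivalent binomial form: five or more successes in ten trials.
r = 5:10;
p_bin = sum(arrayfun(@(s) nchoosek(10, s), r) .* p.^r .* q.^(10 - r));
% Poisson tail: P(N >= kk) for N ~ Poisson(mu) via the incomplete gamma function.
mu = 130;
kk = 110:10:160;
p_po = gammainc(mu, kk);
disp([p_nb p_bin]); disp(p_po)
```

The agreement of p_nb and p_bin illustrates the remark in Example 7.3 that the mth success comes no later than the kth trial iff there are at least m successes in k trials.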



7.1.4 The density function 

If the probability mass in the induced distribution is spread smoothly along the real line, with no point mass 
concentrations, there is a probability density function \( f_X \) which satisfies 

\[ P(X \in M) = P_X(M) = \int_M f_X(t)\, dt \quad \text{(area under the graph of } f_X \text{ over } M\text{)} \qquad (7.11) \]

At each t, \( f_X(t) \) is the mass per unit length in the probability distribution. The density function has three 
characteristic properties: 

\[ \text{(f1) } f_X \ge 0 \qquad \text{(f2) } \int_{\mathbf{R}} f_X = 1 \qquad \text{(f3) } F_X(t) = \int_{-\infty}^{t} f_X \qquad (7.12) \]

A random variable (or distribution) which has a density is called absolutely continuous. This term comes 
from measure theory. We often simply abbreviate as continuous distribution. 

Remarks 

1. There is a technical mathematical description of the condition "spread smoothly with no point mass 
concentrations." And strictly speaking the integrals are Lebesgue integrals rather than the ordinary 
Riemann kind. But for practical cases, the two agree, so that we are free to use ordinary integration 
techniques. 

2. By the fundamental theorem of calculus 

\[ f_X(t) = F_X'(t) \quad \text{at every point of continuity of } f_X \qquad (7.13) \]

3. Any integrable, nonnegative function f with \( \int f = 1 \) determines a distribution function F, which 
in turn determines a probability distribution. If \( \int f \ne 1 \), multiplication by the appropriate positive 
constant gives a suitable f. An argument based on the Quantile Function shows the existence of a 
random variable with that distribution. 

4. In the literature on probability, it is customary to omit the indication of the region of integration when 
integrating over the whole line. Thus 

\[ \int g(t) f_X(t)\, dt = \int_{\mathbf{R}} g(t) f_X(t)\, dt \qquad (7.14) \]

The first expression is not an indefinite integral. In many situations, \( f_X \) will be zero outside an 
interval. Thus, the integrand effectively determines the region of integration. 



Figure 7.2: The Weibull density for \( \alpha = 2 \), \( \lambda = 0.25, 1, 4 \). 

Figure 7.3: The Weibull density for \( \alpha = 10 \), \( \lambda = 0.001, 1, 1000 \). 



7.1.5 Some common absolutely continuous distributions 

1. Uniform (a, b). 

Mass is spread uniformly on the interval [a, b]. It is immaterial whether or not the end points are 
included, since probability associated with each individual point is zero. The probability of any 
subinterval is proportional to the length of the subinterval. The probability of being in any two subintervals 
of the same length is the same. This distribution is used to model situations in which it is known that 
X takes on values in [a, b] but is equally likely to be in any subinterval of a given length. The density 
must be constant over the interval (zero outside), and the distribution function increases linearly with 
t in the interval. Thus, 

\[ f_X(t) = \frac{1}{b - a}, \quad a < t < b \quad \text{(zero outside the interval)} \qquad (7.15) \]

The graph of \( F_X \) rises linearly, with slope 1/(b - a), from zero at t = a to one at t = b. 

2. Symmetric triangular (-a, a). 

\[ f_X(t) = \begin{cases} (a + t)/a^2, & -a \le t < 0 \\ (a - t)/a^2, & 0 \le t \le a \end{cases} \]

This distribution is used frequently in instructional numerical examples because probabilities can be 
obtained geometrically. It can be shifted, with a shift of the graph, to different sets of values. It 
appears naturally (in shifted form) as the distribution for the sum or difference of two independent 
random variables uniformly distributed on intervals of the same length. This fact is established with 
the use of the moment generating function (see Transform Methods). More generally, the density may 
have a triangular graph which is not symmetric. 

Example 7.5: Use of a triangular distribution 

Suppose X ~ symmetric triangular (100, 300). Determine \( P(120 < X \le 250) \). 

Remark. Note that in the continuous case, it is immaterial whether the end points of the 
intervals are included or not. 

SOLUTION 

To get the area under the triangle between 120 and 250, we take one minus the area of the 
right triangles between 100 and 120 and between 250 and 300. Each half of the full triangle has 
area 1/2. Using the fact that areas of similar triangles are proportional to the square of any side, we have 

\[ P = 1 - \frac{1}{2}\left((20/100)^2 + (50/100)^2\right) = 0.855 \qquad (7.16) \]
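
The geometric argument of Example 7.5 can be confirmed by integrating the density numerically. This is a sketch only: the density is the symmetric triangular form shifted to the interval (100, 300), and the center c, half-width a, and grid spacing of 0.01 are our own choices.

```matlab
% Sketch: Example 7.5 by numerical integration of the triangular density.
c = 200;  a = 100;                            % center and half-width of (100, 300)
f = @(t) max(a - abs(t - c), 0) / a^2;        % shifted symmetric triangular density
t = linspace(120, 250, 13001);                % grid with spacing 0.01
P = trapz(t, f(t));                           % area under the graph over [120, 250]
s = linspace(100, 300, 20001);
total = trapz(s, f(s));                       % total mass, expected to be 1
fprintf('P = %.4f, total mass = %.4f\n', P, total)
```

The trapezoidal rule is essentially exact here because the density is piecewise linear, so the result should agree with (7.16) to within rounding.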



3. Exponential (\( \lambda \)). \( f_X(t) = \lambda e^{-\lambda t} \), \( t \ge 0 \) (zero elsewhere). 

Integration shows \( F_X(t) = 1 - e^{-\lambda t} \), \( t \ge 0 \) (zero elsewhere). We note that \( P(X > t) = 1 - F_X(t) = e^{-\lambda t} \), \( t \ge 0 \). 
This leads to an extremely important property of the exponential distribution. Since 
X > t + h, h > 0 implies X > t, we have 

\[ P(X > t + h \mid X > t) = \frac{P(X > t + h)}{P(X > t)} = \frac{e^{-\lambda (t+h)}}{e^{-\lambda t}} = e^{-\lambda h} = P(X > h) \qquad (7.17) \]

Because of this property, the exponential distribution is often used in reliability problems. Suppose 
X represents the time to failure (i.e., the life duration) of a device put into service at t = 0. If the 
distribution is exponential, this property says that if the device survives to time t (i.e., X > t) then 
the (conditional) probability it will survive h more units of time is the same as the original probability 
of surviving for h units of time. Many devices have the property that they do not wear out. Failure 
is due to some stress of external origin. Many solid state electronic devices behave essentially in this 
way, once initial "burn in" tests have removed defective units. Use of Cauchy's equation (Appendix B) 
shows that the exponential distribution is the only continuous distribution with this property. 

4. Gamma distribution (\( \alpha, \lambda \)). \( f_X(t) = \dfrac{\lambda^{\alpha} t^{\alpha - 1} e^{-\lambda t}}{\Gamma(\alpha)} \), \( t \ge 0 \) (zero elsewhere). 

We have an m-function gammadbn to determine values of the distribution function for X ~ gamma 
(\( \alpha, \lambda \)). Use of moment generating functions shows that for \( \alpha = n \), a random variable X ~ gamma 
(\( n, \lambda \)) has the same distribution as the sum of n independent random variables, each exponential (\( \lambda \)). 
A relation to the Poisson distribution is described in Sec 7.5. 

Example 7.6: An arrival problem 

On a Saturday night, the times (in hours) between arrivals in a hospital emergency unit may 
be represented by a random quantity which is exponential (\( \lambda = 3 \)). As we show in the chapter 
Mathematical Expectation (Section 11.1), this means that the average interarrival time is 1/3 
hour or 20 minutes. What is the probability of ten or more arrivals in four hours? In six 
hours? 

SOLUTION 

The time for ten arrivals is the sum of ten interarrival times. If we suppose these are 
independent, as is usually the case, then the time for ten arrivals is gamma (10, 3). 

>> p = gammadbn(10,3,[4 6]) 
p = 0.7576 0.9846 
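
Example 7.6 can be reproduced without gammadbn. The sketch below assumes the standard fact that the distribution function of gamma (\( \alpha, \lambda \)) at t is the regularized lower incomplete gamma function evaluated at \( \lambda t \), available in base MATLAB as gammainc(lambda*t,alpha). It then recomputes the same probabilities from Poisson counts, since ten or more arrivals occur by time t iff the count \( N_t \) ~ Poisson (\( \lambda t \)) is at least ten. The variable names are ours.

```matlab
% Sketch: Example 7.6 without gammadbn.
alpha = 10;  lambda = 3;  t = [4 6];
% Distribution function of gamma(alpha, lambda): F(t) = gammainc(lambda*t, alpha).
p_gamma = gammainc(lambda*t, alpha);
% Same events via Poisson counts: P(N_t >= 10) = 1 - P(N_t <= 9).
j = 0:alpha-1;
p_pois = arrayfun(@(mu) 1 - sum(exp(-mu) * mu.^j ./ factorial(j)), lambda*t);
disp([p_gamma; p_pois])
```

Both rows of the display should agree with each other and, after rounding, with the values returned by gammadbn above.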

5. Normal, or Gaussian (/i, (Jjfxif) = — 7^ =ex P \\\~ir) ) ^ 

We generally indicate that a random variable X has the normal or gaussian distribution by writing 
X ~ N (/x, it 2 ), putting in the actual values for the parameters. The gaussian distribution plays a 
central role in many aspects of applied probability theory, particularly in the area of statistics. Much 
of its importance comes from the central limit theorem (CLT), which is a term applied to a number 
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of theorems in analysis. Essentially, the CLT shows that the distribution for the sum of a sufficiently 
large number of independent random variables has approximately the gaussian distribution. Thus, the 
gaussian distribution appears naturally in such topics as theory of errors or theory of noise, where the 
quantity observed is an additive combination of a large number of essentially independent quantities. 
Examination of the expression shows that the graph for fx (i) is symmetric about its maximum at 
t = (x. The greater the parameter a 2 , the smaller the maximum value and the more slowly the curve 
decreases with distance from [i. Thus parameter /i locates the center of the mass distribution and a 2 
is a measure of the spread of mass about /x. The parameter /z is called the mean value and a 2 is the 
variance. The parameter a, the positive square root of the variance, is called the standard deviation. 
While we have an explicit formula for the density function, it is known that the distribution function, 
as the integral of the density function, cannot be expressed in terms of elementary functions. The 
usual procedure is to use tables obtained by numerical integration. 

Since there are two parameters, this raises the question whether a separate table is needed for each 
pair of parameters. It is a remarkable fact that this is not the case. We need only have a table of the 
distribution function for X ~ N (0, 1). This is refered to as the standardized normal distribution. We 
use (j) and $ for the standardized normal density and distribution functions, respectively. 
Standardized normal^ (t) = -hr=e~ l ^ 2 so that the distribution function is $ (t) = J_ </> (u) du. 
The graph of the density function is the well known bell shaped curve, symmetrical about the origin 
(see Figure 7.4). The symmetry about the origin contributes to its usefulness. 

P (X < t) = $ (t) = area under the curve to the left of t (7.18) 

Note that the area to the left of t = —1.5 is the same as the area to the right of t = 1.5, so that 
$ ( — 2) = 1 — $ (2). The same is true for any t, so that we have 

$(-*) = l-$(i) Vt (7.19) 

This indicates that we need only a table of values of $ (i) for t > to be able to determine $ (£) for 
any t. We may use the symmetry for any case. Note that $ (0) = 1/2, 

Figure 7.4: The standardized normal distribution.

Example 7.7: Standardized normal calculations

Suppose $X \sim N(0, 1)$. Determine $P(-1 \le X \le 2)$ and $P(|X| > 1)$.
SOLUTION

1. $P(-1 \le X \le 2) = \Phi(2) - \Phi(-1) = \Phi(2) - [1 - \Phi(1)] = \Phi(2) + \Phi(1) - 1$

2. $P(|X| > 1) = P(X > 1) + P(X < -1) = 1 - \Phi(1) + \Phi(-1) = 2[1 - \Phi(1)]$

From a table of the standardized normal distribution function (see Appendix D (Section 17.4)),
we find $\Phi(2) = 0.9772$ and $\Phi(1) = 0.8413$, which gives $P(-1 \le X \le 2) = 0.8185$ and
$P(|X| > 1) = 0.3174$.
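
The table lookups in Example 7.7 can be checked numerically. The following is a minimal Python sketch, not one of the text's m-functions: it assumes the standard identity $\Phi(t) = \frac{1}{2}[1 + \mathrm{erf}(t/\sqrt{2})]$, and the name `phi_cdf` is hypothetical.

```python
import math

def phi_cdf(t: float) -> float:
    """Standardized normal distribution function Phi(t), via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

# Part 1: P(-1 <= X <= 2) = Phi(2) + Phi(1) - 1
p1 = phi_cdf(2) + phi_cdf(1) - 1
# Part 2: P(|X| > 1) = 2[1 - Phi(1)]
p2 = 2 * (1 - phi_cdf(1))
print(round(p1, 4), round(p2, 4))

# Symmetry relation (7.19) at t = 1.5
print(phi_cdf(-1.5), 1 - phi_cdf(1.5))
```

The computed values differ from the table-based answers only in the fourth decimal place, which reflects the four-place rounding of the table entries.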

General gaussian distribution 

For $X \sim N(\mu, \sigma^2)$, the density maintains the bell shape, but is shifted with different spread and
height. Figure 7.5 shows the distribution function and density function for $X \sim N(2, 0.1)$. The
density is centered about $t = 2$. It has height 1.2616 as compared with 0.3989 for the standardized
normal density. Inspection shows that the graph is narrower than that for the standardized normal.
The distribution function reaches 0.5 at the mean value 2.

Figure 7.5: The normal density and distribution functions for $X \sim N(2, 0.1)$.



A change of variables in the integral shows that the table for standardized normal distribution function 
can be used for any case. 



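\[ F_X(t) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{t} \exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right) dx = \int_{-\infty}^{t} \varphi\left(\frac{x-\mu}{\sigma}\right) \frac{1}{\sigma}\, dx \tag{7.20} \]

Make the change of variable and corresponding formal changes

\[ u = \frac{x-\mu}{\sigma} \qquad du = \frac{1}{\sigma}\, dx \qquad x = t \sim u = \frac{t-\mu}{\sigma} \tag{7.21} \]

to get

\[ F_X(t) = \int_{-\infty}^{(t-\mu)/\sigma} \varphi(u)\, du = \Phi\left(\frac{t-\mu}{\sigma}\right) \tag{7.22} \]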



Example 7.8: General gaussian calculation 

Suppose $X \sim N(3, 16)$ (i.e., $\mu = 3$ and $\sigma^2 = 16$). Determine $P(-1 \le X \le 11)$ and
$P(|X - 3| > 4)$.

SOLUTION

a. $F_X(11) - F_X(-1) = \Phi\left(\frac{11-3}{4}\right) - \Phi\left(\frac{-1-3}{4}\right) = \Phi(2) - \Phi(-1) = 0.8185$

b. $P(X - 3 < -4) + P(X - 3 > 4) = F_X(-1) + [1 - F_X(7)] = \Phi(-1) + 1 - \Phi(1) = 0.3174$

In each case the problem reduces to that in Example 7.7 (Standardized normal calculations).

We have m-functions gaussian and gaussdensity to calculate values of the distribution and density
function for any reasonable value of the parameters.

The following are solutions of Example 7.7 (Standardized normal calculations) and Example 7.8 (General
gaussian calculation), using the m-function gaussian.

Example 7.9: Example 7.7 (Standardized normal calculations) and Example 7.8 
(General gaussian calculation) (continued) 

>> P1 = gaussian(0,1,2) - gaussian(0,1,-1)
P1 = 0.8186
>> P2 = 2*(1 - gaussian(0,1,1))
P2 = 0.3173
>> P1 = gaussian(3,16,11) - gaussian(3,16,-1)
P1 = 0.8186
>> P2 = gaussian(3,16,-1) + 1 - gaussian(3,16,7)
P2 = 0.3173

The differences in these results and those above (which used tables) are due to the roundoff 
to four places in the tables. 
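
For readers without the m-functions, the standardization in (7.22) can be sketched in Python. The names `gaussian_cdf` and `gaussian_density` are hypothetical stand-ins; the argument order (mean, variance, t) is an assumption chosen to mirror the calls shown above, and nothing here reproduces the actual m-function code.

```python
import math

def gaussian_cdf(mu: float, var: float, t: float) -> float:
    """F_X(t) = Phi((t - mu)/sigma) for X ~ N(mu, var), as in (7.22)."""
    sigma = math.sqrt(var)
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))

def gaussian_density(mu: float, var: float, t: float) -> float:
    """Density of N(mu, var) at t."""
    return math.exp(-((t - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Example 7.8 with mu = 3, variance = 16
p1 = gaussian_cdf(3, 16, 11) - gaussian_cdf(3, 16, -1)
p2 = gaussian_cdf(3, 16, -1) + 1 - gaussian_cdf(3, 16, 7)
print(round(p1, 4), round(p2, 4))

# Peak heights quoted for Figure 7.5: N(2, 0.1) against the standardized normal
print(round(gaussian_density(2, 0.1, 2), 4), round(gaussian_density(0, 1, 0), 4))
```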

6. Beta($r, s$), $r > 0$, $s > 0$. $f_X(t) = \frac{\Gamma(r+s)}{\Gamma(r)\Gamma(s)} t^{r-1} (1-t)^{s-1}$, $0 < t < 1$
Analysis is based on the integrals

\[ \int_0^1 u^{r-1} (1-u)^{s-1}\, du = \frac{\Gamma(r)\Gamma(s)}{\Gamma(r+s)} \quad \text{with } \Gamma(t+1) = t\,\Gamma(t) \tag{7.23} \]

Figure 7.6 and Figure 7.7 show graphs of the densities for various values of $r$, $s$. The usefulness comes
in approximating densities on the unit interval. By using scaling and shifting, these can be extended
to other intervals. The special case $r = s = 1$ gives the uniform distribution on the unit interval. The
Beta distribution is quite useful in developing the Bayesian statistics for the problem of sampling to
determine a population proportion. If $r$, $s$ are integers, the density function is a polynomial. For the
general case we have two m-functions, beta and betadbn, to perform the calculations.
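
The Beta density formula can be evaluated directly with the gamma function. The sketch below is in Python rather than MATLAB and does not reproduce the m-functions beta or betadbn; `beta_density` and `midpoint_integral` are hypothetical helper names. It checks the uniform special case and that the density integrates to one.

```python
import math

def beta_density(r: float, s: float, t: float) -> float:
    """Beta(r, s) density at a point 0 < t < 1."""
    coeff = math.gamma(r + s) / (math.gamma(r) * math.gamma(s))
    return coeff * t ** (r - 1) * (1 - t) ** (s - 1)

def midpoint_integral(f, a: float, b: float, n: int = 10000) -> float:
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

# r = s = 1 gives the uniform density on the unit interval
print(beta_density(1, 1, 0.3))
# Integer parameters give a polynomial: Beta(2, 2) has density 6 t (1 - t)
print(beta_density(2, 2, 0.5))
# Total mass for one of the cases plotted in Figure 7.6
total = midpoint_integral(lambda t: beta_density(2, 10, t), 0.0, 1.0)
print(total)
```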

Figure 7.6: The Beta(r,s) density for r = 2, s = 1, 2, 10.

Figure 7.7: The Beta(r,s) density for r = 5, s = 2, 5, 10.



7. Weibull($\alpha, \lambda, \nu$) $F_X(t) = 1 - e^{-\lambda (t-\nu)^{\alpha}}$, $\alpha > 0$, $\lambda > 0$, $\nu \ge 0$, $t \ge \nu$

The parameter $\nu$ is a shift parameter. Usually we assume $\nu = 0$. Examination shows that for
$\alpha = 1$ the distribution is exponential ($\lambda$). The parameter $\alpha$ provides a distortion of the time scale for
the exponential distribution. Figure 7.6 and Figure 7.7 show graphs of the Weibull density for some
representative values of $\alpha$ and $\lambda$ ($\nu = 0$). The distribution is used in reliability theory. We do not make
much use of it. However, we have m-functions weibull (density) and weibulld (distribution function)
for shift parameter $\nu = 0$ only. The shift can be obtained by subtracting a constant from the $t$ values.
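
As a small illustration of the two remarks above (the case $\alpha = 1$ and the handling of the shift), here is a Python sketch. The function names are hypothetical and are not the m-functions weibull or weibulld.

```python
import math

def weibull_cdf(alpha: float, lam: float, t: float, nu: float = 0.0) -> float:
    """F_X(t) = 1 - exp(-lam * (t - nu)**alpha) for t >= nu, and 0 for t < nu."""
    if t < nu:
        return 0.0
    return 1.0 - math.exp(-lam * (t - nu) ** alpha)

def exponential_cdf(lam: float, t: float) -> float:
    """Distribution function for exponential (lam)."""
    return 1.0 - math.exp(-lam * t) if t >= 0 else 0.0

# alpha = 1 reduces to the exponential distribution
print(weibull_cdf(1.0, 0.5, 2.0), exponential_cdf(0.5, 2.0))

# A shift nu is equivalent to subtracting the constant from the t values
print(weibull_cdf(2.0, 0.5, 3.0, nu=1.0), weibull_cdf(2.0, 0.5, 3.0 - 1.0))
```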



7.2 Distribution Approximations 2 

7.2.1 Binomial, Poisson, gamma, and Gaussian distributions 

The Poisson approximation to the binomial distribution 

The following approximation is a classical one. We wish to show that for small $p$ and sufficiently large $n$

\[ P(X = k) = C(n, k)\, p^k (1-p)^{n-k} \approx e^{-np} \frac{(np)^k}{k!} \tag{7.24} \]

Suppose $p = \mu/n$ with $n$ large and $\mu/n < 1$. Then,

\[ P(X = k) = C(n, k) (\mu/n)^k (1 - \mu/n)^{n-k} = \frac{n(n-1)\cdots(n-k+1)}{n^k} \left(1 - \frac{\mu}{n}\right)^{-k} \left(1 - \frac{\mu}{n}\right)^{n} \frac{\mu^k}{k!} \tag{7.25} \]



2 This content is available online at <http://cnx.org/content/m23313/1.7/>.

The first factor in the last expression is the ratio of polynomials in $n$ of the same degree $k$, which must
approach one as $n$ becomes large. The second factor approaches one as $n$ becomes large. According to a well
known property of the exponential

\[ \left(1 - \frac{\mu}{n}\right)^n \to e^{-\mu} \quad \text{as } n \to \infty \tag{7.26} \]

The result is that for large $n$, $P(X = k) \approx e^{-\mu} \frac{\mu^k}{k!}$, where $\mu = np$.
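
The limit can be watched numerically. The following Python sketch holds $\mu = np$ fixed (the value $\mu = 3$ is an arbitrary choice for illustration) and reports the largest pointwise difference between the binomial and Poisson probabilities for $0 \le k \le 10$ as $n$ grows.

```python
import math

def binomial_pmf(n: int, p: float, k: int) -> float:
    """P(X = k) for X ~ binomial (n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(mu: float, k: int) -> float:
    """P(Y = k) for Y ~ Poisson (mu)."""
    return math.exp(-mu) * mu ** k / math.factorial(k)

mu = 3.0  # illustrative value, held fixed while n increases
diffs = []
for n in (10, 100, 1000, 10000):
    p = mu / n
    max_diff = max(abs(binomial_pmf(n, p, k) - poisson_pmf(mu, k)) for k in range(11))
    diffs.append(max_diff)
    print(n, max_diff)
```

The differences shrink roughly in proportion to $1/n$, in line with the argument based on (7.25) and (7.26).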
The Poisson and gamma distributions 

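Suppose $Y \sim$ Poisson ($\lambda t$). Now $X \sim$ gamma ($\alpha, \lambda$) iff

\[ P(X \le t) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)} \int_0^t x^{\alpha-1} e^{-\lambda x}\, dx = \frac{1}{\Gamma(\alpha)} \int_0^t (\lambda x)^{\alpha-1} e^{-\lambda x}\, d(\lambda x) \tag{7.27} \]

\[ = \frac{1}{\Gamma(\alpha)} \int_0^{\lambda t} u^{\alpha-1} e^{-u}\, du \tag{7.28} \]

A well known definite integral, obtained by integration by parts, is

\[ \int_a^{\infty} t^{n-1} e^{-t}\, dt = \Gamma(n)\, e^{-a} \sum_{k=0}^{n-1} \frac{a^k}{k!} \quad \text{with } \Gamma(n) = (n-1)! \tag{7.29} \]

Noting that $1 = e^{-a} e^{a} = e^{-a} \sum_{k=0}^{\infty} \frac{a^k}{k!}$, we find after some simple algebra that

\[ \frac{1}{\Gamma(n)} \int_0^a t^{n-1} e^{-t}\, dt = e^{-a} \sum_{k=n}^{\infty} \frac{a^k}{k!} \tag{7.30} \]

For $a = \lambda t$ and $\alpha = n$, we have the following equality iff $X \sim$ gamma ($\alpha, \lambda$).

\[ P(X \le t) = \frac{1}{\Gamma(n)} \int_0^{\lambda t} u^{n-1} e^{-u}\, du = e^{-\lambda t} \sum_{k=n}^{\infty} \frac{(\lambda t)^k}{k!} \tag{7.31} \]

Now

\[ P(Y \ge n) = e^{-\lambda t} \sum_{k=n}^{\infty} \frac{(\lambda t)^k}{k!} \quad \text{iff } Y \sim \text{Poisson } (\lambda t) \tag{7.32} \]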
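
The identity $P(X \le t) = P(Y \ge n)$ for $X \sim$ gamma ($n, \lambda$) and $Y \sim$ Poisson ($\lambda t$) can be checked numerically. The Python sketch below integrates the gamma density by the midpoint rule and compares the result with the Poisson tail; the parameter values $n = 3$, $\lambda = 2$, $t = 1.5$ are arbitrary choices for illustration, and the function names are hypothetical.

```python
import math

def gamma_cdf_numeric(n: int, lam: float, t: float, steps: int = 20000) -> float:
    """P(X <= t) for X ~ gamma (n, lam), by midpoint integration of the density."""
    dx = t / steps
    coeff = lam ** n / math.gamma(n)
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += coeff * x ** (n - 1) * math.exp(-lam * x)
    return total * dx

def poisson_tail(mu: float, n: int) -> float:
    """P(Y >= n) for Y ~ Poisson (mu), computed as 1 - P(Y <= n - 1)."""
    return 1.0 - sum(math.exp(-mu) * mu ** k / math.factorial(k) for k in range(n))

n, lam, t = 3, 2.0, 1.5           # illustrative values, so lam * t = 3
left = gamma_cdf_numeric(n, lam, t)
right = poisson_tail(lam * t, n)
print(left, right)                # both approximately 0.5768
```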

The gaussian (normal) approximation 

The central limit theorem, referred to in the discussion of the gaussian or normal distribution above,
suggests that the binomial and Poisson distributions should be approximated by the gaussian. The number
of successes in $n$ trials has the binomial ($n, p$) distribution. This random variable may be expressed

\[ X = \sum_{i=1}^{n} I_{E_i} \quad \text{where the } I_{E_i} \text{ constitute an independent class} \tag{7.33} \]

Since the mean value of $X$ is $np$ and the variance is $npq$, the distribution should be approximately
$N(np, npq)$.

Figure 7.8: Gaussian approximation to the binomial (n = 300, p = 0.1).



Use of the generating function shows that the sum of independent Poisson random variables is Poisson.
Now if $X \sim$ Poisson ($\mu$), then $X$ may be considered the sum of $n$ independent random variables, each Poisson
($\mu/n$). Since the mean value and the variance are both $\mu$, it is reasonable to suppose that $X$ is
approximately $N(\mu, \mu)$.

It is generally best to compare distribution functions. Since the binomial and Poisson distributions
are integer-valued, it turns out that the best gaussian approximation is obtained by making a "continuity
correction." To get an approximation to a density for an integer-valued random variable, the probability at
$t = k$ is represented by a rectangle of height $p_k$ and unit width, with $k$ as the midpoint. Figure 7.8 shows a
plot of the "density" and the corresponding gaussian density for $n = 300$, $p = 0.1$. It is apparent that the
gaussian density is offset by approximately 1/2. To approximate the probability $P(X \le k)$, take the area under
the gaussian curve to the left of $k + 1/2$; this is called the continuity correction.
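
The effect of the continuity correction can be measured for the case $n = 300$, $p = 0.1$ discussed above. This Python sketch compares the exact binomial distribution function with the gaussian value taken at $k$ and at $k + 1/2$; the range $15 \le k \le 45$ and the helper names are assumptions made for illustration.

```python
import math

def phi_cdf(z: float) -> float:
    """Standardized normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binomial_cdf(n: int, p: float, k: int) -> float:
    """Exact P(X <= k) for X ~ binomial (n, p)."""
    return sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k + 1))

n, p = 300, 0.1
mean = n * p
sd = math.sqrt(n * p * (1 - p))

plain_err = []       # gaussian evaluated at k
corrected_err = []   # gaussian evaluated at k + 1/2
for k in range(15, 46):
    exact = binomial_cdf(n, p, k)
    plain_err.append(abs(exact - phi_cdf((k - mean) / sd)))
    corrected_err.append(abs(exact - phi_cdf((k + 0.5 - mean) / sd)))

print(max(plain_err), max(corrected_err))
```

Over this range the corrected approximation has a markedly smaller maximum error than the uncorrected one.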

Use of m-procedures to compare 

We have two m-procedures to make the comparisons. First, we consider approximation of the
Poisson ($\mu$) distribution. The m-procedure poissapp calls for a value of $\mu$, selects a suitable range about
$k = \mu$, and plots the distribution function for the Poisson distribution (stairs) and the normal (gaussian)
distribution (dash dot) for $N(\mu, \mu)$. In addition, the continuity correction is applied to the gaussian
distribution at integer values (circles). Figure 7.9 shows plots for $\mu = 10$. It is clear that the continuity correction
provides a much better approximation. The plots in Figure 7.10 are for $\mu = 100$. Here the continuity
correction provides the better approximation, but not by as much as for the smaller $\mu$.

Figure 7.9: Gaussian approximation to the Poisson distribution function $\mu = 10$.

Figure 7.10: Gaussian approximation to the Poisson distribution function $\mu = 100$.

Figure 7.11: Poisson and Gaussian approximation to the binomial: n = 1000, p = 0.03.

Figure 7.12: Poisson and Gaussian approximation to the binomial: n = 50, p = 0.6.



The m-procedure bincomp compares the binomial, gaussian, and Poisson distributions. It calls for values 
of n and p, selects suitable k values, and plots the distribution function for the binomial, a continuous 
approximation to the distribution function for the Poisson, and continuity adjusted values of the gaussian 
distribution function at the integer values. Figure 7.11 shows plots for n = 1000, p = 0.03. The good 
agreement of all three distribution functions is evident. Figure 7.12 shows plots for n = 50, p = 0.6. There is 
still good agreement of the binomial and adjusted gaussian. However, the Poisson distribution does not track 
very well. The difficulty, as we see in the unit Variance (Section 12.1), is the difference in variances — npq 
for the binomial as compared with np for the Poisson. 
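
The variance remark can be made concrete with a short calculation. The Python sketch below is not the m-procedure bincomp; it only tabulates $npq$ against $np$ for the two cases plotted and the largest gap between the binomial and Poisson distribution functions. The helper names and the cutoff `kmax = 100` are assumptions.

```python
import math

def binomial_pmf(n: int, p: float, k: int) -> float:
    """P(X = k) for X ~ binomial (n, p); zero for k > n."""
    if k > n:
        return 0.0
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(mu: float, k: int) -> float:
    """P(Y = k) for Y ~ Poisson (mu), computed in log space to avoid overflow."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def max_cdf_gap(n: int, p: float, kmax: int = 100) -> float:
    """Largest |F_binomial(k) - F_Poisson(k)| over 0 <= k <= kmax, with mu = n p."""
    mu = n * p
    fb = fp = gap = 0.0
    for k in range(kmax + 1):
        fb += binomial_pmf(n, p, k)
        fp += poisson_pmf(mu, k)
        gap = max(gap, abs(fb - fp))
    return gap

gap_a = max_cdf_gap(1000, 0.03)   # variances 29.1 (binomial) and 30 (Poisson)
gap_b = max_cdf_gap(50, 0.6)      # variances 12 (binomial) and 30 (Poisson)
print(1000 * 0.03 * 0.97, 1000 * 0.03, gap_a)
print(50 * 0.6 * 0.4, 50 * 0.6, gap_b)
```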



7.2.2 Approximation of a real random variable by simple random variables 

Simple random variables play a significant role, both in theory and applications. In the unit Random 
Variables (Section 6.1), we show how a simple random variable is determined by the set of points on the 
real line representing the possible values and the corresponding set of probabilities that each of these values 
is taken on. This describes the distribution of the random variable and makes possible calculations of event 
probabilities and parameters for the distribution. 

A continuous random variable is characterized by a set of possible values spread continuously over an 
interval or collection of intervals. In this case, the probability is also spread smoothly. The distribution is 
described by a probability density function, whose value at any point indicates "the probability per unit 
length" near the point. A simple approximation is obtained by subdividing an interval which includes the 
range (the set of possible values) into small enough subintervals that the density is approximately constant 
over each subinterval. A point in each subinterval is selected and is assigned the probability mass in its 
subinterval. The combination of the selected points and the corresponding probabilities describes the dis- 
tribution of an approximating simple random variable. Calculations based on this distribution approximate 
corresponding calculations on the continuous distribution. 

Before examining a general approximation procedure which has significant consequences for later
treatments, we consider some illustrative examples.

Example 7.10: Simple approximation to Poisson 

A random variable with the Poisson distribution is unbounded. However, for a given parameter value
$\mu$, the probability for $k \ge n$, $n$ sufficiently large, is negligible. Experiment indicates $n = \mu + 6\sqrt{\mu}$
(i.e., six standard deviations beyond the mean) is a reasonable value for $5 \le \mu \le 200$.

>> mu = [5 10 20 30 40 50 70 100 150 200];
>> K = zeros(1,length(mu));
>> p = zeros(1,length(mu));
>> for i = 1:length(mu)
     K(i) = floor(mu(i) + 6*sqrt(mu(i)));
     p(i) = cpoisson(mu(i),K(i));
   end
>> disp([mu;K;p*1e6]')
% Residual probabilities are 0.000001
% times the numbers in the last column.
% K is the value of k needed to achieve
% the residual shown.
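
The same residual probabilities can be computed without the m-function cpoisson. The Python sketch below assumes that cpoisson(mu, K) returns $P(X \ge K)$, which is how it is used above; the tail is summed directly in log space, and truncating the sum after 400 terms is an assumption that is harmless this far beyond the mean.

```python
import math

def poisson_pmf(mu: float, k: int) -> float:
    """P(X = k) for X ~ Poisson (mu), in log space so large mu does not overflow."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def poisson_tail(mu: float, k_start: int, terms: int = 400) -> float:
    """Approximate P(X >= k_start) by summing `terms` consecutive probabilities."""
    return sum(poisson_pmf(mu, k) for k in range(k_start, k_start + terms))

mus = [5, 10, 20, 30, 40, 50, 70, 100, 150, 200]
rows = []
for mu in mus:
    K = math.floor(mu + 6 * math.sqrt(mu))   # six standard deviations beyond the mean
    rows.append((mu, K, poisson_tail(mu, K)))

for mu, K, p in rows:
    # last column is the residual probability times 1e6, as in the disp call above
    print(f"{mu:5d} {K:5d} {p * 1e6:10.4f}")
```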

An m-procedure for discrete approximation

If $X$ is bounded, absolutely continuous with density function $f_X$, the m-procedure tappr sets up the
distribution for an approximating simple random variable. An interval containing the range of $X$ is divided
into a specified number of equal subdivisions. The probability mass for each subinterval is assigned to
the midpoint. If $dx$ is the length of the subintervals, then the integral of the density function over the
subinterval is approximated by $f_X(t_i)\, dx$, where $t_i$ is the midpoint. In effect, the graph of the density over
the subinterval is approximated by a rectangle of length $dx$ and height $f_X(t_i)$. Once the approximating
simple distribution is established, calculations are carried out as for simple random variables.



Example 7.11: A numerical example

Suppose $f_X(t) = 3t^2$, $0 \le t \le 1$. Determine $P(0.2 \le X \le 0.9)$.
SOLUTION

In this case, an analytical solution is easy. $F_X(t) = t^3$ on the interval [0, 1], so
$P = 0.9^3 - 0.2^3 = 0.7210$. We use tappr as follows:

>> tappr
Enter matrix [a b] of x-range endpoints  [0 1]
Enter number of x approximation points  200
Enter density as a function of t  3*t.^2
Use row matrices X and PX as in the simple case
>> M = (X >= 0.2)&(X <= 0.9);
>> p = M*PX'
p = 0.7210

Because of the regularity of the density and the number of approximation points, the result agrees quite well 
with the theoretical value. 
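
The midpoint scheme described for tappr can be sketched in a few lines of Python. This is an illustration of the idea only, not the m-procedure itself: the function name `tappr` is borrowed for readability, the final renormalization so that the probabilities sum to one is an assumption, and the density is passed as a function rather than entered at a prompt.

```python
def tappr(a: float, b: float, n: int, density):
    """Sketch of a midpoint approximation: returns values X and probabilities PX."""
    dx = (b - a) / n
    X = [a + (i + 0.5) * dx for i in range(n)]    # midpoints of the subintervals
    PX = [density(t) * dx for t in X]             # mass f_X(t_i) dx at each midpoint
    total = sum(PX)
    PX = [p / total for p in PX]                  # assumption: renormalize to sum to 1
    return X, PX

# Example 7.11: f_X(t) = 3 t^2 on [0, 1], with 200 approximation points
X, PX = tappr(0.0, 1.0, 200, lambda t: 3 * t ** 2)
p = sum(px for x, px in zip(X, PX) if 0.2 <= x <= 0.9)
print(round(p, 4))
```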

The next example is a more complex one. In particular, the distribution is not bounded. However, it is
easy to determine a bound beyond which the probability is negligible.

Figure 7.13: Distribution function for Example 7.12 (Radial tire mileage)



Example 7.12: Radial tire mileage 

The life (in miles) of a certain brand of radial tires may be represented by a random variable $X$
with density

\[ f_X(t) = \begin{cases} t^2/a^3 & \text{for } 0 \le t < a \\ (b/a)\, e^{-k(t-a)} & \text{for } a \le t \end{cases} \tag{7.34} \]

where $a = 40,000$, $b = 20/3$, and $k = 1/4000$. Determine $P(X \ge 45,000)$.

>> a = 40000;
>> b = 20/3;
>> k = 1/4000;
>> % Test shows cutoff point of 80000 should be satisfactory
>> tappr
Enter matrix [a b] of x-range endpoints  [0 80000]
Enter number of x approximation points  80000/20
Enter density as a function of t  (t.^2/a^3).*(t < 40000) + ...
(b/a)*exp(k*(a-t)).*(t >= 40000)
Use row matrices X and PX as in the simple case
>> P = (X >= 45000)*PX'
P = 0.1910      % Theoretical value = (2/3)exp(-5/4) = 0.191003
>> cdbn
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX   % See Figure 7.13 for plot

In this case, we use a rather large number of approximation points. As a consequence, the results
are quite accurate. In the single-variable case, designating a large number of approximating points
usually causes no computer memory problem.
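
The theoretical value quoted in the example, and the adequacy of the cutoff at 80000, both follow from integrating the exponential branch of the density: for $t \ge a$, $P(X \ge t) = \frac{b}{ak} e^{-k(t-a)}$. A brief Python check, with the hypothetical helper name `tail_prob`:

```python
import math

a, b, k = 40000.0, 20 / 3, 1 / 4000

def tail_prob(t: float) -> float:
    """P(X >= t) for t >= a, from integrating (b/a) exp(-k (x - a)) over [t, infinity)."""
    return (b / (a * k)) * math.exp(-k * (t - a))

# Total mass: the polynomial branch contributes 1/3, the exponential branch tail_prob(a)
print(1 / 3 + tail_prob(a))

# Theoretical answer (2/3) exp(-5/4)
print(round(tail_prob(45000), 6))

# Mass neglected by cutting off at 80000
print(tail_prob(80000))
```

The neglected mass is about $3 \times 10^{-5}$, which is consistent with the remark that the cutoff point of 80000 should be satisfactory.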

The general approximation procedure 

We show now that any bounded real random variable may be approximated as closely as desired by 
a simple random variable (i.e., one having a finite set of possible values). For the unbounded case, the 
approximation is close except in a portion of the range having arbitrarily small total probability. 

We limit our discussion to the bounded case, in which the range of $X$ is limited to a bounded interval
$I = [a, b]$. Suppose $I$ is partitioned into $n$ subintervals by points $t_i$, $1 \le i \le n-1$, with $a = t_0$ and $b = t_n$.
Let $M_i = [t_{i-1}, t_i)$ be the $i$th subinterval, $1 \le i \le n-1$, and $M_n = [t_{n-1}, t_n]$ (see Figure 7.14). Now random
variable $X$ may map into any point in the interval, and hence into any point in each subinterval $M_i$. Let
$E_i = X^{-1}(M_i)$ be the set of points mapped into $M_i$ by $X$. Then the $E_i$ form a partition of the basic space
$\Omega$. For the given subdivision, we form a simple random variable $X_s$ as follows. In each subinterval, pick a
point $s_i$, $t_{i-1} \le s_i < t_i$. Consider the simple random variable $X_s = \sum_{i=1}^{n} s_i I_{E_i}$.

Figure 7.14: Partition of the interval I including the range of X

Figure 7.15: Refinement of the partition by additional subdivision points.



This random variable is in canonical form. If $\omega \in E_i$, then $X(\omega) \in M_i$ and $X_s(\omega) = s_i$. Now the
absolute value of the difference satisfies

\[ |X(\omega) - X_s(\omega)| < t_i - t_{i-1} \quad \text{the length of subinterval } M_i \tag{7.35} \]

Since this is true for each $\omega$ and the corresponding subinterval, we have the important fact

\[ |X(\omega) - X_s(\omega)| < \text{maximum length of the } M_i \tag{7.36} \]

By making the subintervals small enough by increasing the number of subdivision points, we can make the
difference as small as we please.

While the choice of the $s_i$ is arbitrary in each $M_i$, the selection of $s_i = t_{i-1}$ (the left-hand endpoint) leads
to the property $X_s(\omega) \le X(\omega)\ \forall\, \omega$. In this case, if we add subdivision points to decrease the size of some or
all of the $M_i$, the new simple approximation $Y_s$ satisfies

\[ X_s(\omega) \le Y_s(\omega) \le X(\omega) \quad \forall\, \omega \tag{7.37} \]

To see this, consider $t_i^* \in M_i$ (see Figure 7.15). $M_i$ is partitioned into $M_i' \cup M_i''$ and $E_i$ is partitioned into
$E_i' \cup E_i''$. $X$ maps $E_i'$ into $M_i'$ and $E_i''$ into $M_i''$. $Y_s$ maps $E_i'$ into $t_{i-1}$ and maps $E_i''$ into $t_i^* > t_{i-1}$. $X_s$ maps both
$E_i'$ and $E_i''$ into $t_{i-1}$. Thus, the asserted inequality must hold for each $\omega$. By taking a sequence of partitions
in which each succeeding partition refines the previous (i.e. adds subdivision points) in such a way that
the maximum length of subinterval goes to zero, we may form a nondecreasing sequence of simple random
variables $X_n$ which increase to $X$ for each $\omega$.

The latter result may be extended to random variables unbounded above. Simply let the $N$th set of
subdivision points extend from $a$ to $N$, making the last subinterval $[N, \infty)$. Subintervals from $a$ to $N$ are made
increasingly shorter. The result is a nondecreasing sequence $\{X_N : 1 \le N\}$ of simple random variables, with
$X_N(\omega) \to X(\omega)$ as $N \to \infty$, for each $\omega \in \Omega$.

For probability calculations, we simply select an interval I large enough that the probability outside I is 
negligible and use a simple approximation over I. 
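
The left-endpoint construction and its behavior under refinement can be illustrated with equal subintervals. In the Python sketch below, the function name, the interval $[0, 1]$, the sample values standing in for $X(\omega)$, and the choice of 4 and then 8 subintervals are all assumptions made for illustration.

```python
def left_endpoint_approx(x: float, a: float, b: float, n: int) -> float:
    """Value of the simple approximation at a point where X takes the value x:
    the left endpoint of the subinterval (of n equal ones on [a, b]) containing x."""
    dx = (b - a) / n
    i = min(int((x - a) / dx), n - 1)   # the last subinterval is closed on the right
    return a + i * dx

a, b = 0.0, 1.0
samples = [0.0, 0.1, 0.3, 0.5, 0.77, 0.999]   # hypothetical values X(omega)
results = []
for x in samples:
    xs = left_endpoint_approx(x, a, b, 4)      # coarse partition
    ys = left_endpoint_approx(x, a, b, 8)      # refinement: each subinterval halved
    results.append((x, xs, ys))
    print(x, xs, ys)
```

For every sample value the output satisfies $X_s \le Y_s \le X$, and the error of each approximation is smaller than the corresponding subinterval length, as in (7.36) and (7.37).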

7.3 Problems on Distribution and Density Functions 3 

Exercise 7.1 (Solution on p. 189.) 

(See Exercises 3 (Exercise 6.3) and 4 (Exercise 6.4) from "Problems on Random Variables and
Probabilities"). The class $\{C_j : 1 \le j \le 10\}$ is a partition. Random variable $X$ has values
{1, 3, 2, 3, 4, 2, 1, 3, 5, 2} on $C_1$ through $C_{10}$, respectively, with probabilities 0.08, 0.13, 0.06, 0.09,
0.14, 0.11, 0.12, 0.07, 0.11, 0.09. Determine and plot the distribution function $F_X$.

Exercise 7.2 (Solution on p. 189.) 

(See Exercise 6 (Exercise 6.6) from "Problems on Random Variables and Probabilities"). A store 
has eight items for sale. The prices are $3.50, $5.00, $3.50, $7.50, $5.00, $5.00, $3.50, and $7.50, 
respectively. A customer comes in. She purchases one of the items with probabilities 0.10, 0.15, 
0.15, 0.20, 0.10, 0.05, 0.10, 0.15. The random variable expressing the amount of her purchase may
be written

\[ X = 3.5 I_{C_1} + 5.0 I_{C_2} + 3.5 I_{C_3} + 7.5 I_{C_4} + 5.0 I_{C_5} + 5.0 I_{C_6} + 3.5 I_{C_7} + 7.5 I_{C_8} \tag{7.38} \]

Determine and plot the distribution function for X. 

Exercise 7.3 (Solution on p. 189.) 

(See Exercise 12 (Exercise 6.12) from "Problems on Random Variables and Probabilities"). The 
class {A, B, C, D} has minterm probabilities 

pm = 0.001 * [5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302] (7.39) 

Determine and plot the distribution function for the random variable $X = I_A + I_B + I_C + I_D$,
which counts the number of the events which occur on a trial.



3 This content is available online at <http://cnx.org/content/m24209/1.4/>.

Exercise 7.4 (Solution on p. 189.)

Suppose $a$ is a ten digit number. A wheel turns up the digits 0 through 9 with equal probability
on each spin. On ten spins what is the probability of matching, in order, $k$ or more of the ten digits
in $a$, $0 \le k \le 10$? Assume the initial digit may be zero.

Exercise 7.5 (Solution on p. 189.) 

In a thunderstorm in a national park there are 127 lightning strikes. Experience shows that the 
probability of a lightning strike starting a fire is about 0.0083. What is the probability that k
fires are started, k = 0, 1, 2,3? 

Exercise 7.6 (Solution on p. 189.) 

A manufacturing plant has 350 special lamps on its production lines. On any day, each lamp 
could fail with probability p = 0.0017. These lamps are critical, and must be replaced as quickly as 
possible. It takes about one hour to replace a lamp, once it has failed. What is the probability that 
on any day the loss of production time due to lamp failures is k or fewer hours, k = 0, 1, 2, 3, 4, 5?

Exercise 7.7 (Solution on p. 189.) 

Two hundred persons buy tickets for a drawing. Each ticket has probability 0.008 of winning. 
What is the probability of k or fewer winners, k = 2, 3, 4? 

Exercise 7.8 (Solution on p. 189.) 

Two coins are flipped twenty times. What is the probability the results match (both heads or both 
tails) $k$ times, $0 \le k \le 20$?

Exercise 7.9 (Solution on p. 189.) 

Thirty members of a class each flip a coin ten times. What is the probability that at least five of 
them get seven or more heads? 

Exercise 7.10 (Solution on p. 189.) 

For the system in Exercise 7.6, call a day in which one or more failures occur among the 350 lamps a 
"service day." Since a Bernoulli sequence "starts over" at any time, the sequence of service/nonservice 
days may be considered a Bernoulli sequence with probability $p_1$, the probability of one or more
lamp failures in a day. 

a. Beginning on a Monday morning, what is the probability the first service day is the first, 
second, third, fourth, fifth day of the week? 

b. What is the probability of no service days in a seven day week? 

Exercise 7.11 (Solution on p. 190.) 

For the system in Exercise 7.6 and Exercise 7.10 assume the plant works seven days a week. What 
is the probability the third service day occurs by the end of 10 days? Solve using the negative 
binomial distribution; repeat using the binomial distribution. 

Exercise 7.12 (Solution on p. 190.) 

A residential College plans to raise money by selling "chances" on a board. Fifty chances are sold. 
A player pays $10 to play; he or she wins $30 with probability p = 0.2. The profit to the College is 

\[ X = 50 \cdot 10 - 30N, \quad \text{where } N \text{ is the number of winners} \tag{7.40} \]

Determine the distribution for X and calculate P (X > 0), P (X > 200), and 
P(X > 300). 

Exercise 7.13 (Solution on p. 190.) 

A single six-sided die is rolled repeatedly until either a one or a six turns up. What is the probability 
that the first appearance of either of these numbers is achieved by the fifth trial or sooner? 

Exercise 7.14 (Solution on p. 190.) 

Consider a Bernoulli sequence with probability p = 0.53 of success on any component trial. 




a. The probability the fourth success will occur no later than the tenth trial is determined by 
the negative binomial distribution. Use the procedure nbinom to calculate this probability.

b. Calculate this probability using the binomial distribution. 

Exercise 7.15 (Solution on p. 190.) 

Fifty percent of the components coming off an assembly line fail to meet specifications for a special 
job. It is desired to select three units which meet the stringent specifications. Items are selected 
and tested in succession. Under the usual assumptions for Bernoulli trials, what is the probability 
the third satisfactory unit will be found on six or fewer trials? 

Exercise 7.16 (Solution on p. 190.) 

The number of cars passing a certain traffic count position in an hour has Poisson (53) distribution. 
What is the probability the number of cars passing in an hour lies between 45 and 55 (inclusive)? 
What is the probability of more than 55? 

Exercise 7.17 (Solution on p. 190.) 

Compare P (X < k) and P (Y < k) for X ~ binomial(5000, 0.001) and Y ~ Poisson (5), for
$0 \le k \le 10$. Do this directly with ibinom and ipoisson. Then use the m-procedure bincomp to
obtain graphical results (including a comparison with the normal distribution). 

Exercise 7.18 (Solution on p. 191.) 

Suppose X ~ binomial (12, 0.375), Y ~ Poisson (4.5), and Z ~ exponential (1/4.5). For each
random variable, calculate and tabulate the probability of a value at least $k$, for integer values
$3 \le k \le 8$.

Exercise 7.19 (Solution on p. 191.) 

The number of noise pulses arriving on a power circuit in an hour is a random quantity having 
Poisson (7) distribution. What is the probability of having at least 10 pulses in an hour? What is 
the probability of having at most 15 pulses in an hour? 

Exercise 7.20 (Solution on p. 191.) 

The number of customers arriving in a small specialty store in an hour is a random quantity having 
Poisson (5) distribution. What is the probability the number arriving in an hour will be between 
three and seven, inclusive? What is the probability of no more than ten? 

Exercise 7.21 (Solution on p. 191.) 

Random variable X ~ binomial (1000, 0.1). 

a. Determine P (X > 80), P (X > 100), P (X > 120)

b. Use the appropriate Poisson distribution to approximate these values. 

Exercise 7.22 (Solution on p. 191.) 

The time to failure, in hours of operating time, of a television set subject to random voltage surges
has the exponential (0.002) distribution. Suppose the unit has operated successfully for 500 hours. 
What is the (conditional) probability it will operate for another 500 hours? 

Exercise 7.23 (Solution on p. 191.) 

For $X \sim$ exponential ($\lambda$), determine $P(X > 1/\lambda)$, $P(X > 2/\lambda)$.

Exercise 7.24 (Solution on p. 191.) 

Twenty "identical" units are put into operation. They fail independently. The times to failure 
(in hours) form an iid class, exponential (0.0002). This means the "expected" life is 5000 hours. 
Determine the probabilities that at least k, for k = 5, 8, 10, 12, 15, will survive for 5000 hours.

Exercise 7.25 (Solution on p. 191.) 

Let T ~ gamma (20, 0.0002) be the total operating time for the units described in Exercise 7.24. 

a. Use the m-function for the gamma distribution to determine P (T < 100,000).

b. Use the Poisson distribution to determine P (T < 100,000).

Exercise 7.26 (Solution on p. 191.)

The sum of the times to failure for five independent units is a random variable X ~ gamma 
(5, 0.15). Without using tables or m-programs, determine P (X < 25). 

Exercise 7.27 (Solution on p. 192.) 

Interarrival times (in minutes) for fax messages on a terminal are independent, exponential ($\lambda = 0.1$).
This means the time $X$ for the arrival of the fourth message is gamma(4, 0.1). Without using
tables or m-programs, utilize the relation of the gamma to the Poisson distribution to determine
$P(X < 30)$.

Exercise 7.28 (Solution on p. 192.) 

Customers arrive at a service center with independent interarrival times in hours, which have 
exponential (3) distribution. The time X for the third arrival is thus gamma (3, 3). Without using 
tables or m-programs, determine P (X < 2). 

Exercise 7.29 (Solution on p. 192.) 

Five people wait to use a telephone, currently in use by a sixth person. Suppose time for the six 
calls (in minutes) are iid, exponential (1/3). What is the distribution for the total time Z from the 
present for the six calls? Use an appropriate Poisson distribution to determine P (Z < 20). 

Exercise 7.30 (Solution on p. 192.) 

A random number generator produces a sequence of numbers between 0 and 1. Each of these
can be considered an observed value of a random variable uniformly distributed on the interval [0, 
1]. They assume their values independently. A sequence of 35 numbers is generated. What is the 
probability 25 or more are less than or equal to 0.71? (Assume continuity. Do not make a discrete 
adjustment.) 

Exercise 7.31 (Solution on p. 192.) 

Five "identical" electronic devices are installed at one time. The units fail independently, and the 
time to failure, in days, of each is a random variable exponential (1/30). A maintenance check 
is made each fifteen days. What is the probability that at least four are still operating at the 
maintenance check? 

Exercise 7.32 (Solution on p. 192.) 

Suppose $X \sim N(4, 81)$. That is, $X$ has gaussian distribution with mean $\mu = 4$ and variance
$\sigma^2 = 81$.

a. Use a table of standardized normal distribution to determine $P(2 < X < 8)$ and
$P(|X - 4| < 5)$.

b. Calculate the probabilities in part (a) with the m-function gaussian. 

Exercise 7.33 (Solution on p. 192.) 

Suppose $X \sim N(5, 81)$. That is, $X$ has gaussian distribution with $\mu = 5$ and $\sigma^2 = 81$. Use a table
of standardized normal distribution to determine $P(3 < X < 9)$ and $P(|X - 5| < 5)$. Check your
results using the m-function gaussian. 

Exercise 7.34 (Solution on p. 193.) 

Suppose $X \sim N(3, 64)$. That is, $X$ has gaussian distribution with $\mu = 3$ and $\sigma^2 = 64$. Use a table
of standardized normal distribution to determine $P(1 < X < 9)$ and $P(|X - 3| < 4)$. Check your
results with the m-function gaussian. 

Exercise 7.35 (Solution on p. 193.) 

Items coming off an assembly line have a critical dimension which is represented by a random 
variable $X \sim N(10, 0.01)$. Ten items are selected at random. What is the probability that three or
more are within 0.05 of the mean value $\mu$?

Exercise 7.36 (Solution on p. 193.) 

The result of extensive quality control sampling shows that a certain model of digital watches 
coming off a production line have accuracy, in seconds per month, that is normally distributed with 
$\mu = 5$ and $\sigma^2 = 300$. To achieve a top grade, a watch must have an accuracy within the range of
-5 to +10 seconds per month. What is the probability a watch taken from the production line to 
be tested will achieve top grade? Calculate, using a standardized normal table. Check with the 
m-function gaussian. 

Exercise 7.37 (Solution on p. 193.) 

Use the m-procedure bincomp with various values of n from 10 to 500 and p from 0.01 to 0.7, to 
observe the approximation of the binomial distribution by the Poisson. 

Exercise 7.38 (Solution on p. 193.) 

Use the m-procedure poissapp to compare the Poisson and gaussian distributions. Use various 
values of $\mu$ from 10 to 500.

Exercise 7.39 (Solution on p. 193.) 

Random variable $X$ has density $f_X(t) = \frac{3}{2} t^2$, $-1 \le t \le 1$ (and zero elsewhere).

a. Determine $P(-0.5 < X < 0.8)$, $P(|X| > 0.5)$, $P(|X - 0.25| < 0.5)$.

b. Determine an expression for the distribution function. 

c. Use the m-procedures tappr and cdbn to plot an approximation to the distribution function. 

Exercise 7.40 (Solution on p. 194.) 

Random variable $X$ has density function $f_X(t) = t - \frac{3}{8} t^2$, $0 \le t \le 2$ (and zero elsewhere).

a. Determine $P(X < 0.5)$, $P(0.5 < X < 1.5)$, $P(|X - 1| < 1/4)$.

b. Determine an expression for the distribution function. 

c. Use the m-procedures tappr and cdbn to plot an approximation to the distribution function. 

Exercise 7.41 (Solution on p. 194.) 

Random variable X has density function

$f_X(t) = \begin{cases} (6/5)\,t^2 & \text{for } 0 \le t \le 1 \\ (6/5)(2 - t) & \text{for } 1 < t \le 2 \end{cases} \;=\; I_{[0,1]}(t)\,\frac{6}{5}t^2 + I_{(1,2]}(t)\,\frac{6}{5}(2 - t)$ (7.41)

a. Determine $P(X < 0.5)$, $P(0.5 < X < 1.5)$, $P(|X - 1| < 1/4)$.

b. Determine an expression for the distribution function. 

c. Use the m-procedures tappr and cdbn to plot an approximation to the distribution function.

Solutions to Exercises in Chapter 7

Solution to Exercise 7.1 (p. 184) 



T = [1 3 2 3 4 2 1 3 5 2];
pc = 0.01* [8 13 6 9 14 11 12 7 11 9] ; 
[X,PX] = csort(T,pc) ; 
ddbn 

Enter row matrix of VALUES X 
Enter row matrix of PROBABILITIES PX 

Solution to Exercise 7.2 (p. 184) 

T = [3.5 5 3.5 7.5 5 5 3.5 7.5]; 
pc = 0.01* [10 15 15 20 10 5 10 15]; 
[X,PX] = csort(T,pc) ; 
ddbn 

Enter row matrix of VALUES X 
Enter row matrix of PROBABILITIES PX 

Solution to Exercise 7.3 (p. 184) 



% See MATLAB plot

% See MATLAB plot

npr06_12 (Section 17.8.28: npr06_12)
Minterm probabilities in pm, coefficients in c

T = sum(mintable(4));      % Alternate solution. See Exercise 12 (Exercise 6.12) from "Problems on Random V:
[X,PX] = csort(T,pm);
ddbn

Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX      % See MATLAB plot

Solution to Exercise 7.4 (p. 185) 

P = cbinom(10,0.1,0:10)
Solution to Exercise 7.5 (p. 185)

P = ibinom(127,0.0083,0:3)
P = 0.3470 0.3688 0.1945 0.0678
Solution to Exercise 7.6 (p. 185) 

P = 1 - cbinom(350,0.0017,1:6)
  = 0.5513 0.8799 0.9775 0. 0.9996 1.0000


Solution to Exercise 7.7 (p. 185) 

P = 1 - cbinom(200,0.008,3:5) = 0.7838 0.9220 0.9768 
Solution to Exercise 7.8 (p. 185) 

P = ibinom(20, 1/2, 0:20) 
Solution to Exercise 7.9 (p. 185) 

p = cbinom(10,0.5,7) = 0.1719 

P = cbinom(30,p,5) = 0.6052 

Solution to Exercise 7.10 (p. 185) 

p1 = 1 - (1 - 0.0017)^350 = 0.4487      (prob given day is a service day)
k = 1:5;

a. P = p1*(1 - p1).^(k-1) = 0.4487 0.2474 0.1364 0.0752 0.0414

b. P0 = (1 - p1)^7 = 0.0155



Solution to Exercise 7.11 (p. 185) 

p1 = 1 - (1 - 0.0017)^350 = 0.4487

• P = sum(nbinom(3,p1,3:10)) = 0.8990

• Pa = cbinom(10,p1,3) = 0.8990

Solution to Exercise 7.12 (p. 185) 



N = 0:50; 
PN = ibinom(50, 0.2, 0:50); 
X = 500 - 30*N; 
Ppos = (X>0)*PN' 
Ppos = 0.9856 
P200 = (X>=200)*PN' 
P200 = 0.5836 
P300 = (X>=300)*PN' 
P300 = 0.1034 

Solution to Exercise 7.13 (p. 185) 
P = 1 - (2/3)^5 = 0.8683
Solution to Exercise 7.14 (p. 185) 

a. P = sum(nbinom(4,0.53,4:10)) = 0.8729 

b. Pa = cbinom(10,0.53,4) = 0.8729 

Solution to Exercise 7.15 (p. 186) 

P = cbinom(6,0.5,3) = 0.6562 
Solution to Exercise 7.16 (p. 186) 

P1 = cpoisson(53,45) - cpoisson(53,56) = 0.5224

P2 = cpoisson(53,56) = 0.3581 
Solution to Exercise 7.17 (p. 186) 



k = 0:10;
Pb = 1 - cbinom(5000,0.001,k+1);
Pp = 1 - cpoisson(5,k+1);
disp([k;Pb;Pp]')
         0    0.0067    0.0067
    1.0000    0.0404    0.0404
    2.0000    0.1245    0.1247
    3.0000    0.2649    0.2650
    4.0000    0.4404    0.4405
    5.0000    0.6160    0.6160
    6.0000    0.7623    0.7622
    7.0000    0.8667    0.8666
    8.0000    0.9320    0.9319
    9.0000    0.9682    0.9682
   10.0000    0.9864    0.9863

bincomp
Enter the parameter n  5000
Enter the parameter p  0.001
Binomial-- stairs
Poisson-- -.-.
Adjusted Gaussian-- o o o
gtext('Exercise 17')
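
As a cross-check that does not depend on the m-functions, the table above can be reproduced with a short Python sketch using only the standard library. The helper names binom_cdf and poisson_cdf are illustrative assumptions, not part of the text's toolbox. The sketch relies on the fact, used in the MATLAB lines above, that 1 - cbinom(n,p,k+1) and 1 - cpoisson(mu,k+1) are the probabilities of k or fewer.

```python
from math import comb, exp, factorial

def binom_cdf(n, p, k):
    """P(X <= k) for X ~ binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def poisson_cdf(mu, k):
    """P(Y <= k) for Y ~ Poisson(mu)."""
    return exp(-mu) * sum(mu**i / factorial(i) for i in range(k + 1))

# Same parameters as the solution: n = 5000, p = 0.001, so mu = n*p = 5.
rows = [(k, binom_cdf(5000, 0.001, k), poisson_cdf(5, k)) for k in range(11)]
for k, pb, pp in rows:
    print(f"{k:3d}  {pb:.4f}  {pp:.4f}")
```

The two columns should agree to about three decimal places, which is the point of the exercise: for large n and small p the Poisson distribution with $\mu = np$ is a close approximation to the binomial.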

Solution to Exercise 7.18 (p. 186) 



k = 3:8;
Px = cbinom(12,0.375,k);
Py = cpoisson(4.5,k);
Pz = exp(-k/4.5);
disp([k;Px;Py;Pz]')
    3.0000    0.8865    0.8264    0.5134
    4.0000    0.7176    0.6577    0.4111
    5.0000    0.4897    0.4679    0.3292
    6.0000    0.2709    0.2971    0.2636
    7.0000    0.1178    0.1689    0.2111
    8.0000    0.0390    0.0866    0.1690




Solution to Exercise 7.19 (p. 186) 

P1 = cpoisson(7,10) = 0.1695
P2 = 1 - cpoisson(7,16) = 0.9976
Solution to Exercise 7.20 (p. 186)

P1 = cpoisson(5,3) - cpoisson(5,8) = 0.7420

P2 = 1 - cpoisson(5,11) = 0.9863
Solution to Exercise 7.21 (p. 186) 

k = [80 100 120] ; 
P = cbinom(1000,0.1,k) 
P = 0.9867 0.5154 0.0220 

P1 = cpoisson(100,k)
P1 = 0.9825 0.5133 0.0282

Solution to Exercise 7.22 (p. 186) 

$P(X > 500 + 500 \mid X > 500) = P(X > 500) = e^{-0.002 \cdot 500} = 0.3679$
Solution to Exercise 7.23 (p. 186)

$P(X > k/\lambda) = e^{-\lambda k/\lambda} = e^{-k}$
Solution to Exercise 7.24 (p. 186) 

p = exp(-0.0002*5000)
p = 0.3679 
k = [5 8 10 12 15] ; 
P = cbinom(20,p,k) 
P = 0.9110 0.4655 0.1601 0.0294 0.0006 

Solution to Exercise 7.25 (p. 186) 

P1 = gammadbn(20,0.0002,100000) = 0.5297
P2 = cpoisson(0.0002*100000,20) = 0.5297
Solution to Exercise 7.26 (p. 187)

$P(X \le 25) = P(Y \ge 5)$, $Y \sim$ poisson $(0.15 \cdot 25 = 3.75)$ (7.42)

$P(Y \ge 5) = 1 - P(Y \le 4) = 1 - e^{-3.75}\left(1 + 3.75 + \frac{3.75^2}{2} + \frac{3.75^3}{3!} + \frac{3.75^4}{4!}\right) = 0.3225$ (7.43)

Solution to Exercise 7.27 (p. 187)

$P(X \le 30) = P(Y \ge 4)$, $Y \sim$ poisson $(0.1 \cdot 30 = 3)$ (7.44)

$P(Y \ge 4) = 1 - P(Y \le 3) = 1 - e^{-3}\left(1 + 3 + \frac{3^2}{2} + \frac{3^3}{3!}\right) = 0.3528$ (7.45)

Solution to Exercise 7.28 (p. 187)

$P(X \le 2) = P(Y \ge 3)$, $Y \sim$ poisson $(3 \cdot 2 = 6)$ (7.46)

$P(Y \ge 3) = 1 - P(Y \le 2) = 1 - e^{-6}(1 + 6 + 36/2) = 0.9380$ (7.47)

Solution to Exercise 7.29 (p. 187)

$Z \sim$ gamma $(6, 1/3)$.

$P(Z \le 20) = P(Y \ge 6)$, $Y \sim$ poisson $(1/3 \cdot 20)$ (7.48)

$P(Y \ge 6)$ = cpoisson(20/3,6) = 0.6547 (7.49)
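
The solutions to Exercises 7.26 through 7.29 all use the same relation: for integer n, $X \sim$ gamma$(n, \lambda)$ satisfies $P(X \le t) = P(Y \ge n)$ with $Y \sim$ poisson$(\lambda t)$. A minimal Python sketch of that relation follows, using only the standard library. The function names are illustrative assumptions, and the gamma parameters for Exercises 7.26 to 7.28 are inferred from the Poisson parameters shown in the solutions rather than quoted from the exercise statements.

```python
from math import exp, factorial

def poisson_tail(mu, n):
    """P(Y >= n) for Y ~ Poisson(mu)."""
    return 1.0 - exp(-mu) * sum(mu**k / factorial(k) for k in range(n))

def gamma_cdf_int(n, lam, t):
    """P(X <= t) for X ~ gamma(n, lam) with n a positive integer."""
    return poisson_tail(lam * t, n)

p26 = gamma_cdf_int(5, 0.15, 25)    # Poisson parameter 3.75, P(Y >= 5)
p27 = gamma_cdf_int(4, 0.1, 30)     # Poisson parameter 3,    P(Y >= 4)
p28 = gamma_cdf_int(3, 3, 2)        # Poisson parameter 6,    P(Y >= 3)
p29 = gamma_cdf_int(6, 1 / 3, 20)   # Poisson parameter 20/3, P(Y >= 6)
print(round(p26, 4), round(p27, 4), round(p28, 4), round(p29, 4))
```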

Solution to Exercise 7.30 (p. 187) 

p = cbinom(35,0.71,25) = 0.5620 
Solution to Exercise 7.31 (p. 187) 

p = exp(-15/30) = 0.6065
P = cbinom(5,p,4) = 0.3483
Solution to Exercise 7.32 (p. 187) 

a.

$P(2 < X < 8) = \Phi((8 - 4)/9) - \Phi((2 - 4)/9) =$ (7.50)

$\Phi(4/9) + \Phi(2/9) - 1 = 0.6712 + 0.5875 - 1 = 0.2587$ (7.51)

$P(|X - 4| < 5) = 2\Phi(5/9) - 1 = 1.4212 - 1 = 0.4212$ (7.52)

b.

P1 = gaussian(4,81,8) - gaussian(4,81,2)
P1 = 0.2596

P2 = gaussian(4,81,9) - gaussian(4,81,-1)
P2 = 0.4215

Solution to Exercise 7.33 (p. 187)

$P(3 < X < 9) = \Phi((9 - 5)/9) - \Phi((3 - 5)/9) = \Phi(4/9) + \Phi(2/9) - 1 = 0.6712 + 0.5875 - 1 = 0.2587$ (7.53)

$P(|X - 5| < 5) = 2\Phi(5/9) - 1 = 1.4212 - 1 = 0.4212$ (7.54)

P1 = gaussian(5,81,9) - gaussian(5,81,3)
P1 = 0.2596

P2 = gaussian(5,81,10) - gaussian(5,81,0)
P2 = 0.4215

Solution to Exercise 7.34 (p. 187) 

$P(1 < X < 9) = \Phi((9 - 3)/8) - \Phi((1 - 3)/8) =$ (7.55)

$\Phi(0.75) + \Phi(0.25) - 1 = 0.7734 + 0.5987 - 1 = 0.3721$ (7.56)

$P(|X - 3| < 4) = 2\Phi(4/8) - 1 = 1.3829 - 1 = 0.3829$ (7.57)

P1 = gaussian(3,64,9) - gaussian(3,64,1)
P1 = 0.3721

P2 = gaussian(3,64,7) - gaussian(3,64,-1)
P2 = 0.3829

Solution to Exercise 7.35 (p. 187) 

p = gaussian(10,0.01,10.05) - gaussian(10,0.01,9.95)
p = 0.3829 
P = cbinom(10,p,3) 
P = 0.8036 
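
The two-stage calculation above (a gaussian probability feeding a binomial tail) can be checked independently with the Python standard library. In this sketch gaussian_cdf and binom_tail are illustrative names, not the text's m-functions; the argument order (mean, variance, value) mirrors the way gaussian is called in these solutions, and the standard identity $\Phi(z) = \frac{1}{2}(1 + \mathrm{erf}(z/\sqrt{2}))$ is assumed for the normal distribution function.

```python
from math import comb, erf, sqrt

def gaussian_cdf(mu, var, t):
    """P(X <= t) for X ~ N(mu, var), via the error function."""
    return 0.5 * (1.0 + erf((t - mu) / sqrt(2.0 * var)))

def binom_tail(n, p, k):
    """P(at least k successes in n independent trials with success probability p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Probability a single item is within 0.05 of the mean 10 (variance 0.01).
p = gaussian_cdf(10, 0.01, 10.05) - gaussian_cdf(10, 0.01, 9.95)
# Probability that three or more of ten items are within tolerance.
P = binom_tail(10, p, 3)
print(round(p, 4), round(P, 4))
```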

Solution to Exercise 7.36 (p. 187) 

$P(-5 < X < 10) = \Phi(5/\sqrt{300}) + \Phi(10/\sqrt{300}) - 1 = \Phi(0.289) + \Phi(0.577) - 1 = 0.614 + 0.717 - 1 = 0.331$ (7.58)

P = gaussian(5,300,10) - gaussian(5,300,-5) = 0.3317

Solution to Exercise 7.37 (p. 188) 

Experiment with the m-procedure bincomp. 
Solution to Exercise 7.38 (p. 188) 

Experiment with the m-procedure poissapp. 
Solution to Exercise 7.39 (p. 188) 

$\frac{3}{2}\int t^2 = t^3/2$ (7.59)

$P1 = 0.5\,(0.8^3 - (-0.5)^3) = 0.3185 \qquad P2 = 2\int_{0.5}^{1} \frac{3}{2}t^2 = (1 - 0.5^3) = 7/8$ (7.60)

$P3 = P(|X - 0.25| < 0.5) = P(-0.25 < X < 0.75) = \frac{1}{2}\left[(3/4)^3 - (-1/4)^3\right] = 7/32$ (7.61)

b. $F_X(t) = \int_{-1}^{t} f_X = \frac{1}{2}(t^3 + 1)$
c.
tappr
Enter matrix [a b] of x-range endpoints  [-1 1]
Enter number of x approximation points  200
Enter density as a function of t  1.5*t.^2
Use row matrices X and PX as in the simple case
cdbn

Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX      % See MATLAB plot



Solution to Exercise 7.40 (p. 188)

$\int \left(t - \frac{3}{8}t^2\right) = \frac{t^2}{2} - \frac{t^3}{8}$ (7.62)

a.

$P1 = 0.5^2/2 - 0.5^3/8 = 7/64 \qquad P2 = 1.5^2/2 - 1.5^3/8 - 7/64 = 19/32 \qquad P3 = 79/256$

b. $F_X(t) = \frac{t^2}{2} - \frac{t^3}{8}$, $0 < t < 2$
c.

tappr
Enter matrix [a b] of x-range endpoints  [0 2]
Enter number of x approximation points  200
Enter density as a function of t  t - (3/8)*t.^2
Use row matrices X and PX as in the simple case
cdbn

Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX      % See MATLAB plot
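
The idea behind part c, replacing an absolutely continuous distribution by a simple one, can be illustrated with a short Python sketch. This is not the m-procedure tappr, whose internals are not shown here; it is a midpoint approximation under the assumption that each of n subintervals of [a, b] carries mass proportional to the density at its midpoint. The name approx_distribution is hypothetical.

```python
def approx_distribution(f, a, b, n):
    """Simple approximation: n midpoints on [a, b] with normalized masses f(x)*dx."""
    dx = (b - a) / n
    xs = [a + (i + 0.5) * dx for i in range(n)]
    ps = [f(x) * dx for x in xs]
    total = sum(ps)
    return xs, [p / total for p in ps]

def f(t):
    """Density of Exercise 7.40 on [0, 2]."""
    return t - (3 / 8) * t**2

X, PX = approx_distribution(f, 0.0, 2.0, 200)

# Probabilities from the simple distribution, compared with the exact values.
P1 = sum(p for x, p in zip(X, PX) if x <= 0.5)
P2 = sum(p for x, p in zip(X, PX) if 0.5 <= x <= 1.5)
print(P1, 7 / 64)
print(P2, 19 / 32)
```

With 200 approximation points the approximate and exact values should differ only in the later decimal places.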

Solution to Exercise 7.41 (p. 188) 

a.

$P1 = \frac{6}{5}\int_{0}^{1/2} t^2 = 1/20$ (7.63)

$P2 = \frac{6}{5}\int_{1/2}^{1} t^2 + \frac{6}{5}\int_{1}^{3/2} (2 - t) = 4/5$ (7.64)

$P3 = \frac{6}{5}\int_{3/4}^{1} t^2 + \frac{6}{5}\int_{1}^{5/4} (2 - t) = 79/160$ (7.65)

b.

$F_X(t) = \int_{0}^{t} f_X = I_{[0,1]}(t)\,\frac{2}{5}t^3 + I_{(1,2]}(t)\left[-\frac{7}{5} + \frac{6}{5}\left(2t - \frac{t^2}{2}\right)\right]$ (7.66)

c.

tappr
Enter matrix [a b] of x-range endpoints  [0 2]
Enter number of x approximation points  400
Enter density as a function of t  (6/5)*(t<=1).*t.^2 + ...
    (6/5)*(t>1).*(2 - t)
Use row matrices X and PX as in the simple case
cdbn

Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX      % See MATLAB plot



Chapter 8 

Random Vectors and Joint Distributions

8.1 Random Vectors and Joint Distributions 1 

8.1.1 Introduction 

A single, real-valued random variable is a function (mapping) from the basic space $\Omega$ to the real line. That
is, to each possible outcome $\omega$ of an experiment there corresponds a real value $t = X(\omega)$. The mapping
induces a probability mass distribution on the real line, which provides a means of making probability
calculations. The distribution is described by a distribution function $F_X$. In the absolutely continuous case,
with no point mass concentrations, the distribution may also be described by a probability density function
$f_X$. The probability density is the linear density of the probability mass along the real line (i.e., mass per unit
length). The density is thus the derivative of the distribution function. For a simple random variable, the
probability distribution consists of a point mass $p_i$ at each possible value $t_i$ of the random variable. Various
m-procedures and m-functions aid calculations for simple distributions. In the absolutely continuous case,
a simple approximation may be set up, so that calculations for the random variable are approximated by
calculations on this simple distribution.

Often we have more than one random variable. Each can be considered separately, but usually they have 
some probabilistic ties which must be taken into account when they are considered jointly. We treat the 
joint case by considering the individual random variables as coordinates of a random vector. We extend the 
techniques for a single random variable to the multidimensional case. To simplify exposition and to keep 
calculations manageable, we consider a pair of random variables as coordinates of a two-dimensional random 
vector. The concepts and results extend directly to any finite number of random variables considered jointly. 

8.1.2 Random variables considered jointly; random vectors 

As a starting point, consider a simple example in which the probabilistic interaction between two random 
quantities is evident. 

Example 8.1: A selection problem 

Two campus jobs are open. Two juniors and three seniors apply. They seem equally qualified, so it 
is decided to select them by chance. Each combination of two is equally likely. Let X be the number 
of juniors selected (possible values 0, 1, 2) and Y be the number of seniors selected (possible values 
0, 1, 2). However there are only three possible pairs of values for (X, Y) : (0, 2) , (1, 1), or (2, 0). 
Others have zero probability, since they are impossible. Determine the probability for each of the 
possible pairs. 



1 This content is available online at <http://cnx.org/content/m23318/1.7/>.

SOLUTION

There are C (5, 2) = 10 equally likely pairs. Only one pair can be both juniors. Six pairs can be 
one of each. There are C (3, 2) = 3 ways to select pairs of seniors. Thus 

P(X = 0, Y = 2) = 3/10, P(X = 1, Y = 1) =6/10, P (X = 2, Y = 0) = 1/10 (8.1) 

These probabilities add to one, as they must, since this exhausts the mutually exclusive possibilities. 
The probability of any other combination must be zero. We also have the distributions for the 
random variables considered individually.

X = [0 1 2] PX= [3/10 6/10 1/10] Y = [0 1 2] PY = [1/10 6/10 3/10] (8.2) 

We thus have a joint distribution and two individual or marginal distributions. 
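
The counting argument in this example can be confirmed by brute-force enumeration. The following Python sketch lists all ten equally likely pairs and tallies the joint and marginal distributions with exact fractions. The applicant labels J1, J2, S1, S2, S3 are hypothetical names introduced only for the illustration.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations

applicants = ["J1", "J2", "S1", "S2", "S3"]   # two juniors, three seniors
pairs = list(combinations(applicants, 2))      # C(5, 2) = 10 equally likely pairs

joint = defaultdict(Fraction)                  # joint[(x, y)] = P(X = x, Y = y)
for pair in pairs:
    x = sum(name.startswith("J") for name in pair)   # number of juniors selected
    y = 2 - x                                        # number of seniors selected
    joint[(x, y)] += Fraction(1, len(pairs))

PX = defaultdict(Fraction)                     # marginal for X
PY = defaultdict(Fraction)                     # marginal for Y
for (x, y), p in joint.items():
    PX[x] += p
    PY[y] += p

print(dict(joint))
print(dict(PX), dict(PY))
```

Only the three pairs (0, 2), (1, 1), and (2, 0) appear as keys of the joint distribution, in agreement with the observation that all other combinations have zero probability.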

We formalize as follows: 

A pair {X, Y} of random variables considered jointly is treated as the pair of coordinate functions for
a two-dimensional random vector W = (X, Y). To each $\omega \in \Omega$, W assigns the pair of real numbers (t, u),
where $X(\omega) = t$ and $Y(\omega) = u$. If we represent the pair of values {t, u} as the point (t, u) on the plane,
then $W(\omega) = (t, u)$, so that

$W = (X, Y): \Omega \to \mathbf{R}^2$ (8.3)

is a mapping from the basic space $\Omega$ to the plane $\mathbf{R}^2$. Since W is a function, all mapping ideas extend. The
inverse mapping $W^{-1}$ plays a role analogous to that of the inverse mapping $X^{-1}$ for a real random variable.
A two-dimensional vector W is a random vector iff $W^{-1}(Q)$ is an event for each reasonable set Q (technically,
each Borel set) on the plane.

A fundamental result from measure theory ensures

W = (X, Y) is a random vector iff each of the coordinate functions X and Y is a random variable.

In the selection example above, we model X (the number of juniors selected) and Y (the number of
seniors selected) as random variables. Hence the vector-valued function W = (X, Y) is a random vector.

8.1.3 Induced distribution and the joint distribution function 

In a manner parallel to that for the single-variable case, we obtain a mapping of probability mass from the
basic space to the plane. Since $W^{-1}(Q)$ is an event for each reasonable set Q on the plane, we may assign
to Q the probability mass

$P_{XY}(Q) = P[W^{-1}(Q)] = P[(X, Y)^{-1}(Q)]$ (8.4)

Because of the preservation of set operations by inverse mappings as in the single-variable case, the mass
assignment determines $P_{XY}$ as a probability measure on the subsets of the plane $\mathbf{R}^2$. The argument parallels
that for the single-variable case. The result is the probability distribution induced by W = (X, Y). To
determine the probability that the vector-valued function W = (X, Y) takes on a (vector) value in region
Q, we simply determine how much induced probability mass is in that region.

Example 8.2: Induced distribution and probability calculations 

To determine $P(1 \le X \le 3, Y > 0)$, we determine the region for which the first coordinate value
(which we call t) is between one and three and the second coordinate value (which we call u) is
greater than zero. This corresponds to the set Q of points on the plane with $1 \le t \le 3$ and $u > 0$.
Geometrically, this is the strip on the plane bounded by (but not including) the horizontal axis and
by the vertical lines t = 1 and t = 3 (included). The problem is to determine how much probability
mass lies in that strip. How this is achieved depends upon the nature of the distribution and how
it is described.

As in the single-variable case, we have a distribution function.

Definition

The joint distribution function $F_{XY}$ for W = (X, Y) is given by

$F_{XY}(t, u) = P(X \le t,\ Y \le u) \quad \forall\, (t, u) \in \mathbf{R}^2$ (8.5)

This means that $F_{XY}(t, u)$ is equal to the probability mass in the region $Q_{tu}$ on the plane such that the
first coordinate is less than or equal to t and the second coordinate is less than or equal to u. Formally, we
may write

$F_{XY}(t, u) = P[(X, Y) \in Q_{tu}]$, where $Q_{tu} = \{(r, s) : r \le t,\ s \le u\}$ (8.6)

Now for a given point (t, u), the region $Q_{tu}$ is the set of points (r, s) on the plane which are on or to the left
of the vertical line through (t, 0) and on or below the horizontal line through (0, u) (see Figure 8.1 for the specific
point t = a, u = b). We refer to such regions as semiinfinite intervals on the plane.

The theoretical result quoted in the real variable case extends to ensure that a distribution on the
plane is determined uniquely by consistent assignments to the semiinfinite intervals $Q_{tu}$. Thus, the induced
distribution is determined completely by the joint distribution function.



$F_{XY}(a, b) = P_{XY}(Q_{ab})$

Figure 8.1: The region $Q_{ab}$ for the value $F_{XY}(a, b)$.



Distribution function for a discrete random vector 

The induced distribution consists of point masses. At point $(t_i, u_j)$ in the range of W = (X, Y) there
is probability mass $p_{ij} = P[W = (t_i, u_j)] = P(X = t_i, Y = u_j)$. As in the general case, to determine
$P[(X, Y) \in Q]$ we determine how much probability mass is in the region. In the discrete case (or in any
case where there are point mass concentrations) one must be careful to note whether or not the boundaries
are included in the region, should there be mass concentrations on the boundary.

Figure 8.2: The joint distribution for Example 8.3 (Distribution function for the selection problem in
Example 8.1 (A selection problem)).



Example 8.3: Distribution function for the selection problem in Example 8.1 (A selection problem)

The probability distribution is quite simple. Mass 3/10 at (0,2), 6/10 at (1,1), and 1/10 at (2,0).
This distribution is plotted in Figure 8.2. To determine (and visualize) the joint distribution
function, think of moving the point (t, u) on the plane. The region $Q_{tu}$ is a giant "sheet" with
corner at (t, u). The value of $F_{XY}(t, u)$ is the amount of probability covered by the sheet. This
value is constant over any grid cell, including the left-hand and lower boundaries, and is the value
taken on at the lower left-hand corner of the cell. Thus, if (t, u) is in any of the three squares on
the lower left hand part of the diagram, no probability mass is covered by the sheet with corner in
the cell. If (t, u) is on or in the square having probability 6/10 at the lower left-hand corner, then
the sheet covers that probability, and the value of $F_{XY}(t, u) = 6/10$. The situation in the other
cells may be checked out by this procedure.
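
The "sheet" description translates directly into a computation: $F_{XY}(t, u)$ is the sum of the point masses at positions $(t_i, u_j)$ with $t_i \le t$ and $u_j \le u$. A minimal Python sketch for the selection problem follows; the function name F_XY and the dictionary layout are illustrative choices, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Point masses of the selection problem: (t_i, u_j) -> probability.
masses = {(0, 2): Fraction(3, 10), (1, 1): Fraction(6, 10), (2, 0): Fraction(1, 10)}

def F_XY(t, u):
    """Total mass covered by the sheet with corner at (t, u)."""
    return sum(p for (ti, uj), p in masses.items() if ti <= t and uj <= u)

# Values at the lower left-hand corners of the grid cells, top row first.
for u in (2, 1, 0):
    print([str(F_XY(t, u)) for t in (0, 1, 2)])
```

Evaluating at a point inside a cell, such as (1.5, 1.5), gives the same value as at the cell's lower left-hand corner (1, 1), which illustrates the constancy over grid cells noted in the example.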

Distribution function for a mixed distribution 

Example 8.4: A mixed distribution 

The pair {X, Y} produces a mixed distribution as follows (see Figure 8.3):

Point masses 1/10 at points (0,0), (1,0), (1,1), (0,1)
Mass 6/10 spread uniformly over the unit square with these vertices

The joint distribution function is zero in the second, third, and fourth quadrants.

• If the point (t, u) is in the square or on the left and lower boundaries, the sheet covers the
point mass at (0,0) plus 0.6 times the area covered within the square. Thus in this region

$F_{XY}(t, u) = \frac{1}{10}(1 + 6tu)$ (8.7)

• If the point (t, u) is above the square (including its upper boundary) but to the left of the line
t = 1, the sheet covers two point masses plus the portion of the mass in the square to the left
of the vertical line through (t, u). In this case

$F_{XY}(t, u) = \frac{1}{10}(2 + 6t)$ (8.8)

• If the point (t, u) is to the right of the square (including its boundary) with $0 \le u < 1$, the
sheet covers two point masses and the portion of the mass in the square below the horizontal
line through (t, u), to give

$F_{XY}(t, u) = \frac{1}{10}(2 + 6u)$ (8.9)

• If (t, u) is above and to the right of the square (i.e., both $1 \le t$ and $1 \le u$), then all probability
mass is covered and $F_{XY}(t, u) = 1$ in this region.
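
The four cases can be checked against a single direct computation: add the point masses covered by the sheet to 0.6 times the area of the unit square that the sheet covers. The Python sketch below does this; the names clamp01 and F_XY are illustrative, and the sketch assumes exactly the mass assignment stated in the example.

```python
def clamp01(v):
    """Restrict v to the interval [0, 1]."""
    return max(0.0, min(1.0, v))

POINT_MASSES = {(0, 0): 0.1, (1, 0): 0.1, (1, 1): 0.1, (0, 1): 0.1}

def F_XY(t, u):
    """Mass covered by the sheet with corner (t, u): point masses plus 0.6 * covered area."""
    points = sum(p for (a, b), p in POINT_MASSES.items() if a <= t and b <= u)
    area = clamp01(t) * clamp01(u)      # covered part of the unit square
    return points + 0.6 * area

print(F_XY(0.5, 0.5))   # inside the square:      (1 + 6tu)/10
print(F_XY(0.5, 2.0))   # above the square:       (2 + 6t)/10
print(F_XY(3.0, 0.25))  # right of the square:    (2 + 6u)/10
print(F_XY(1.0, 1.0))   # above and to the right: all mass covered
print(F_XY(-1.0, 0.5))  # second quadrant:        no mass covered
```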





Figure 8.3: Mixed joint distribution for Example 8.4 (A mixed distribution)



8.1.4 Marginal distributions 

If the joint distribution for a random vector is known, then the distribution for each of the component
random variables may be determined. These are known as marginal distributions. In general, the converse
is not true. However, if the component random variables form an independent pair, the treatment in that
case shows that the marginals determine the joint distribution.

To begin the investigation, note that

$F_X(t) = P(X \le t) = P(X \le t,\ Y < \infty)$, i.e., Y can take any of its possible values (8.10)

Thus

$F_X(t) = F_{XY}(t, \infty) = \lim_{u \to \infty} F_{XY}(t, u)$ (8.11)

This may be interpreted with the aid of Figure 8.4. Consider the sheet for point (t, u).

Figure 8.4: Construction for obtaining the marginal distribution for X.



If we push the point up vertically, the upper boundary of $Q_{tu}$ is pushed up until eventually all probability
mass on or to the left of the vertical line through (t, u) is included. This is the total probability that $X \le t$.
Now $F_X(t)$ describes probability mass on the line. The probability mass described by $F_X(t)$ is the same
as the total joint probability mass on or to the left of the vertical line through (t, u). We may think of the
mass in the half plane being projected onto the horizontal line to give the marginal distribution for X. A
parallel argument holds for the marginal for Y.

$F_Y(u) = P(Y \le u) = F_{XY}(\infty, u)$ = mass on or below horizontal line through (t, u) (8.12)

This mass is projected onto the vertical axis to give the marginal distribution for Y.

Marginals for a joint discrete distribution

Consider a joint simple distribution.

$P(X = t_i) = \sum_{j=1}^{m} P(X = t_i, Y = u_j)$ and $P(Y = u_j) = \sum_{i=1}^{n} P(X = t_i, Y = u_j)$ (8.13)

Thus, all the probability mass on the vertical line through $(t_i, 0)$ is projected onto the point $t_i$ on a horizontal
line to give $P(X = t_i)$. Similarly, all the probability mass on a horizontal line through $(0, u_j)$ is projected
onto the point $u_j$ on a vertical line to give $P(Y = u_j)$.

Example 8.5: Marginals for a discrete distribution 

The pair {X, Y} produces a joint distribution that places mass 2/10 at each of the five points 

(0,0), (1,1), (2,0), (2,2), (3,1) (See Figure 8.5) 

The marginal distribution for X has masses 2/10, 2/10, 4/10, 2/10 at points t = 0,1,2,3, 
respectively. Similarly, the marginal distribution for Y has masses 4/10, 4/10, 2/10 at points 
u = 0, 1, 2, respectively. 
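
The projection of mass onto each axis amounts to summing the joint probabilities over the other coordinate. A short Python sketch for this example follows, using exact fractions; the dictionary-based layout is an illustrative choice rather than the matrix arrangement used by the m-procedures.

```python
from collections import defaultdict
from fractions import Fraction

# Mass 2/10 at each of the five points of the example.
points = [(0, 0), (1, 1), (2, 0), (2, 2), (3, 1)]
joint = {pt: Fraction(2, 10) for pt in points}

PX = defaultdict(Fraction)
PY = defaultdict(Fraction)
for (t, u), p in joint.items():
    PX[t] += p      # project mass onto the horizontal axis
    PY[u] += p      # project mass onto the vertical axis

print(sorted(PX.items()))
print(sorted(PY.items()))
```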





Figure 8.5: Marginal distribution for Example 8.5.



Example 8.6 

Consider again the joint distribution in Example 8.4 (A mixed distribution). The pair {X, Y} 
produces a mixed distribution as follows: 

Point masses 1/10 at points (0,0), (1,0), (1,1), (0,1) 

Mass 6/10 spread uniformly over the unit square with these vertices 

The construction in Figure 8.6 shows the graph of the marginal distribution function $F_X$. There
is a jump in the amount of 0.2 at t = 0, corresponding to the two point masses on the vertical line.
Then the mass increases linearly with t, slope 0.6, until a final jump at t = 1 in the amount of 0.2
produced by the two point masses on the vertical line. At t = 1, the total mass is "covered" and
$F_X(t)$ is constant at one for $t \ge 1$. By symmetry, the marginal distribution for Y is the same.



Figure 8.6: Marginal distribution for Example 8.6.



8.2 Random Vectors and MATLAB 2 



8.2.1 m-procedures for a pair of simple random variables

We examine, first, calculations on a pair of simple random variables X, Y, considered jointly. These are, in
effect, two components of a random vector W = (X, Y), which maps from the basic space $\Omega$ to the plane.
The induced distribution is on the (t, u)-plane. Values on the horizontal axis (t-axis) correspond to values
of the first coordinate random variable X and values on the vertical axis (u-axis) correspond to values of Y.
We extend the computational strategy used for a single random variable.

2 This content is available online at <http://cnx.org/content/m23320/1.6/>.

First, let us review the one-variable strategy. In this case, data consist of values $t_i$ and corresponding
probabilities $P(X = t_i)$ arranged in matrices

$X = [t_1, t_2, \cdots, t_n]$ and $PX = [P(X = t_1), P(X = t_2), \cdots, P(X = t_n)]$ (8.14)

To perform calculations on Z = g(X), we use array operations on X to form a matrix

$G = [g(t_1)\ g(t_2)\ \cdots\ g(t_n)]$ (8.15)

which has $g(t_i)$ in a position corresponding to $P(X = t_i)$ in matrix PX.

Basic problem. Determine $P(g(X) \in M)$, where M is some prescribed set of values.

• Use relational operations to determine the positions for which $g(t_i) \in M$. These will be in a zero-one
matrix N, with ones in the desired positions.

• Select the $P(X = t_i)$ in the corresponding positions and sum. This is accomplished by one of the
MATLAB operations to determine the inner product of N and PX.
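
The two steps of the one-variable strategy can be sketched in a few lines of Python. The values in X and PX, the function g, and the set M (the positive reals) are all hypothetical, chosen only to make the steps concrete; list comprehensions stand in for MATLAB array operations.

```python
X = [-2, -1, 0, 1, 3]                  # hypothetical values t_i
PX = [0.1, 0.2, 0.3, 0.25, 0.15]       # hypothetical probabilities P(X = t_i)

def g(t):
    """Hypothetical function defining Z = g(X)."""
    return t**2 - 1

G = [g(t) for t in X]                  # G has g(t_i) in the position of P(X = t_i)
N = [1 if v > 0 else 0 for v in G]     # zero-one matrix: positions where g(t_i) is in M
P = sum(n * p for n, p in zip(N, PX))  # inner product of N and PX

print(G)
print(N)
print(P)
```

Here the ones in N pick out the positions where $g(t_i) > 0$, and the inner product adds exactly those probabilities.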

We extend these techniques and strategies to a pair of simple random variables, considered jointly. 

a. The data for a pair {X, Y} of random variables are the values of X and Y, which we may put in row
matrices

$X = [t_1\ t_2\ \cdots\ t_n]$ and $Y = [u_1\ u_2\ \cdots\ u_m]$ (8.16)

and the joint probabilities $P(X = t_i, Y = u_j)$ in a matrix P. We usually represent the distribution
graphically by putting probability mass $P(X = t_i, Y = u_j)$ at the point $(t_i, u_j)$ on the plane. This
joint probability is represented by the matrix P with elements arranged corresponding to the
mass points on the plane. Thus

P has element $P(X = t_i, Y = u_j)$ at the $(t_i, u_j)$ position (8.17)

b. To perform calculations, we form computational matrices t and u such that
  - t has element $t_i$ at each $(t_i, u_j)$ position (i.e., at each point on the ith column from the left)
  - u has element $u_j$ at each $(t_i, u_j)$ position (i.e., at each point on the jth row from the bottom)
MATLAB array and logical operations on t, u, P perform the specified operations on $t_i$, $u_j$, and
$P(X = t_i, Y = u_j)$ at each $(t_i, u_j)$ position, in a manner analogous to the operations in the
single-variable case.

c. Formation of the t and u matrices is achieved by a basic setup m-procedure called jcalc. The data
for this procedure are in three matrices: $X = [t_1, t_2, \cdots, t_n]$ is the set of values for random variable
X, $Y = [u_1, u_2, \cdots, u_m]$ is the set of values for random variable Y, and $P = [p_{ij}]$, where $p_{ij} =
P(X = t_i, Y = u_j)$. We arrange the joint probabilities as on the plane, with X-values increasing to
the right and Y-values increasing upward. This is different from the usual arrangement in a matrix, in
which values of the second variable increase downward. The m-procedure takes care of this inversion.
The m-procedure forms the matrices t and u, utilizing the MATLAB function meshgrid, and computes
the marginal distributions for X and Y. In the following example, we display the various steps utilized
in the setup procedure. Ordinarily, these intermediate steps would not be displayed.

Example 8.7: Setup and basic calculations 

>> jdemo4                          % Call for data in file jdemo4.m
>> jcalc                           % Call for setup procedure
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
 Use array operations on matrices X, Y, PX, PY, t, u, and P
>> disp(P)                         % Optional call for display of P
    0.0360    0.0198    0.0297    0.0209    0.0180
    0.0372    0.0558    0.0837    0.0589    0.0744
    0.0516    0.0774    0.1161    0.0817    0.1032
    0.0264    0.0270    0.0405    0.0285    0.0132
>> PX                              % Optional call for display of PX
PX =  0.1512    0.1800    0.2700    0.1900    0.2088
>> PY                              % Optional call for display of PY
PY =  0.1356    0.4300    0.3100    0.1244
                                   % Steps performed by jcalc
>> PX = sum(P)                     % Calculation of PX as performed by jcalc
PX =  0.1512    0.1800    0.2700    0.1900    0.2088
>> PY = fliplr(sum(P'))            % Calculation of PY (note reversal)
PY =  0.1356    0.4300    0.3100    0.1244
>> [t,u] = meshgrid(X,fliplr(Y));  % Formation of t, u matrices (note reversal)
>> disp(t)                         % Display of calculating matrix t
    -3     0     1     3     5     % A row of X-values for each value of Y
    -3     0     1     3     5
    -3     0     1     3     5
    -3     0     1     3     5
>> disp(u)                         % Display of calculating matrix u
     2     2     2     2     2     % A column of Y-values (increasing
     1     1     1     1     1     % upward) for each value of X
     0     0     0     0     0
    -2    -2    -2    -2    -2



Suppose we wish to determine the probability $P(X^2 - 3Y \ge 1)$. Using array operations on t
and u, we obtain the matrix $G = [g(t_i, u_j)]$.

>> G = t.^2 - 3*u                  % Formation of G = [g(t_i,u_j)] matrix
G =
     3    -6    -5     3    19
     6    -3    -2     6    22
     9     0     1     9    25
    15     6     7    15    31
>> M = G >= 1                      % Positions where G >= 1
M =
     1     0     0     1     1
     1     0     0     1     1
     1     0     1     1     1
     1     1     1     1     1
>> pM = M.*P                       % Selection of probabilities
pM =
    0.0360         0         0    0.0209    0.0180
    0.0372         0         0    0.0589    0.0744
    0.0516         0    0.1161    0.0817    0.1032
    0.0264    0.0270    0.0405    0.0285    0.0132
>> PM = total(pM)                  % Total of selected probabilities
PM =  0.7336                       % P(g(X,Y) >= 1)


d. In Example 8.3 (Distribution function for the selection problem in Example 8.1 (A selection
problem)) from "Random Vectors and Joint Distributions" we note that the joint distribution function
$F_{XY}$ is constant over any grid cell, including the left-hand and lower boundaries, at the value taken
on at the lower left-hand corner of the cell. These lower left-hand corner values may be obtained
systematically from the joint probability matrix P by a two step operation.

• Take cumulative sums upward of the columns of P.

• Take cumulative sums of the rows of the resultant matrix.

This can be done with the MATLAB function cumsum, which takes column cumulative sums downward.
By flipping the matrix and transposing, we can achieve the desired results.

Example 8.8: Calculation of Fxy values for Example 3 (Example 8.3: Distribution 
function for the selection problem in Example 8.1 (A selection problem)) from 
"Random Vectors and Joint Distributions" 



> P = 0.1* [3 0; 6 0; 1] ; 



> FXY = flipud(cumsuir 


(flipud(P))) 


FXY = 






0.3000 


0.6000 


0.1000 





0.6000 


0.1000 








0.1000 


> FXY = cumsum(FXY')' 




FXY = 






0.3000 


0.9000 


1.0000 





0.6000 


0.7000 








0.1000 



'/, Cumulative column sums upward 



'/, Cumulative row sums 



U 



0.3 
3/10 



0.9 





1 

o ; o.6 


1 


16/10 




o ; o 





i 

i 




o !i 



1.0 



0.7 



0.1 
1/10 



Figure 8.7: The joint distribution for Example 3 (Example 8.3: Distribution function for the selection 
problem in Example 8.1 (A selection problem)) in "Random Vectors and Joint Distributions'. 



Comparison with Example 3 (Example 8.3: Distribution function for the selection problem in
Example 8.1 (A selection problem)) from "Random Vectors and Joint Distributions" shows
agreement with values obtained by hand.

The two-step procedure has been incorporated into an m-procedure jddbn. As an example,
return to the distribution in Example 8.7 (Setup and basic calculations).

Example 8.9: Joint distribution function for Example 8.7 (Setup and basic calculations)

>> jddbn
Enter joint probability matrix (as on the plane)  P
To view joint distribution function, call for FXY
>> disp(FXY)
    0.1512    0.3312    0.6012    0.7912    1.0000
    0.1152    0.2754    0.5157    0.6848    0.8756
    0.0780    0.1824    0.3390    0.4492    0.5656
    0.0264    0.0534    0.0939    0.1224    0.1356

These values may be put on a grid, in the same manner as in Figure 2 (Figure 8.2) for Example
3 (Example 8.3: Distribution function for the selection problem in Example 8.1 (A selection
problem)) in "Random Vectors and Joint Distributions".
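
The two-step operation used in Examples 8.8 and 8.9 can also be sketched in base MATLAB without any m-procedure. The anonymous function below is a hypothetical helper (the name cornercdf is ours, not one of the author's m-programs). It assumes P is arranged as on the plane, with values of Y increasing upward, and it uses cumsum along dimension 2 in place of the double transpose.

```matlab
% Hypothetical helper (not the author's jddbn): corner values of F_XY
% from a joint probability matrix arranged as on the plane.
cornercdf = @(P) cumsum(flipud(cumsum(flipud(P))), 2);

P   = 0.1*[3 0 0; 0 6 0; 0 0 1];   % matrix used in Example 8.8
FXY = cornercdf(P);                % upward column sums, then row sums
disp(FXY)
```

The upper right-hand entry of the result is the total probability, so it should equal one for any valid joint matrix.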

e. As in the case of canonic for a single random variable, it is often useful to have a function version of the
procedure jcalc to provide the freedom to name the outputs conveniently.

function [x,y,t,u,px,py,p] = jcalcf(X,Y,P)

The quantities x, y, t, u, px, py, and p may be given any desired names.



8.2.2 Joint absolutely continuous random variables

In the single-variable case, the condition that there are no point mass concentrations on the line ensures 
the existence of a probability density function, useful in probability calculations. A similar situation exists 
for a joint distribution for two (or more) variables. For any joint mapping to the plane which assigns zero 
probability to each set with zero area (discrete points, line or curve segments, and countable unions of these) 
there is a density function. 

Definition

If the joint probability distribution for the pair {X, Y} assigns zero probability to every set of points
with zero area, then there exists a joint density function $f_{XY}$ with the property

$P((X, Y) \in Q) = \iint_Q f_{XY}$    (8.18)

We have three properties analogous to those for the single-variable case:

(f1) $f_{XY} \ge 0$    (f2) $\iint_{\mathbb{R}^2} f_{XY} = 1$    (f3) $F_{XY}(t, u) = \int_{-\infty}^{t} \int_{-\infty}^{u} f_{XY}$    (8.19)

At every continuity point for $f_{XY}$, the density is the second partial

$f_{XY}(t, u) = \frac{\partial^2 F_{XY}(t, u)}{\partial t\, \partial u}$    (8.20)

Now

$F_X(t) = F_{XY}(t, \infty) = \int_{-\infty}^{t} \int_{-\infty}^{\infty} f_{XY}(r, s)\, ds\, dr$    (8.21)

A similar expression holds for $F_Y(u)$. Use of the fundamental theorem of calculus to obtain the derivatives
gives the result

$f_X(t) = \int_{-\infty}^{\infty} f_{XY}(t, s)\, ds$  and  $f_Y(u) = \int_{-\infty}^{\infty} f_{XY}(r, u)\, dr$    (8.22)

Marginal densities. Thus, to obtain the marginal density for the first variable, integrate out the second
variable in the joint density, and similarly for the marginal for the second variable.
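
As a small numerical illustration of integrating out a variable, the sketch below uses the density $f_{XY}(t, u) = 8tu$ on $0 \le u \le t \le 1$ (the density treated in Example 8.10). It relies on base MATLAB's integral and integral2, which is an assumption of this sketch, since the text itself works by hand or with the symbolic toolbox. The names f, fX, and mass are illustrative only.

```matlab
f    = @(t,u) 8*t.*u;                          % joint density on 0 <= u <= t <= 1
fX   = @(t0) integral(@(u) f(t0,u), 0, t0);    % integrate out u for a fixed t0
mass = integral2(f, 0, 1, 0, @(t) t);          % total probability over the triangle

% Compare the numerical marginal with the closed form 4*t^3 at t = 0.5.
fprintf('fX(0.5) = %.4f, 4*0.5^3 = %.4f\n', fX(0.5), 4*0.5^3)
fprintf('total mass = %.4f\n', mass)
```

The upper limit of the inner integral is passed as a function handle, which is how the triangular region is described to integral2.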

Example 8.10: Marginal density functions

Let $f_{XY}(t, u) = 8tu$, $0 \le u \le t \le 1$. This region is the triangle bounded by u = 0, u = t, and
t = 1 (see Figure 8.8).

$f_X(t) = \int f_{XY}(t, u)\, du = 8t \int_0^t u\, du = 4t^3, \quad 0 \le t \le 1$    (8.23)

$f_Y(u) = \int f_{XY}(t, u)\, dt = 8u \int_u^1 t\, dt = 4u(1 - u^2), \quad 0 \le u \le 1$    (8.24)

$P(0.5 \le X \le 0.75, Y > 0.5) = P((X, Y) \in Q)$, where Q is the common part of the triangle with
the strip between t = 0.5 and t = 0.75 and above the line u = 0.5. This is the small triangle
bounded by u = 0.5, u = t, and t = 0.75. Thus

$p = 8 \int_{1/2}^{3/4} \int_{1/2}^{t} tu\, du\, dt = 25/256 \approx 0.0977$    (8.25)




Figure 8.8: Distribution for Example 8.10 (Marginal density functions)

Example 8.11: Marginal distribution with compound expression

The pair {X, Y} has joint density $f_{XY}(t, u) = \frac{6}{37}(t + 2u)$ on the region bounded by t = 0, t = 2,
u = 0, and u = max{1, t} (see Figure 8.9). Determine the marginal density $f_X$.

SOLUTION

Examination of the figure shows that we have different limits for the integral with respect to u
for $0 \le t \le 1$ and for $1 < t \le 2$.

• For $0 \le t \le 1$

$f_X(t) = \frac{6}{37} \int_0^1 (t + 2u)\, du = \frac{6}{37}(t + 1)$    (8.26)

• For $1 < t \le 2$

$f_X(t) = \frac{6}{37} \int_0^t (t + 2u)\, du = \frac{12}{37} t^2$    (8.27)

We may combine these into a single expression in a manner used extensively in subsequent treatments.
Suppose M = [0, 1] and N = (1, 2]. Then $I_M(t) = 1$ for $t \in M$ (i.e., $0 \le t \le 1$) and zero
elsewhere. Likewise, $I_N(t) = 1$ for $t \in N$ and zero elsewhere. We can, therefore, express $f_X$ by

$f_X(t) = I_M(t) \frac{6}{37}(t + 1) + I_N(t) \frac{12}{37} t^2$    (8.28)

Figure 8.9: Marginal distribution for Example 8.11 (Marginal distribution with compound expression).
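
The indicator-function form in (8.28) carries over directly to MATLAB, where a logical comparison plays the role of the indicator. The following is a minimal sketch under that reading. The names IM, IN, fX, and total_mass are illustrative, and the use of base MATLAB's integral to check the normalization is an assumption of this sketch rather than part of the text.

```matlab
% Indicator functions as logical masks; fX follows (8.28).
IM = @(t) (t >= 0) & (t <= 1);     % I_M, with M = [0,1]
IN = @(t) (t > 1)  & (t <= 2);     % I_N, with N = (1,2]
fX = @(t) IM(t).*(6/37).*(t + 1) + IN(t).*(12/37).*t.^2;

% A marginal density must integrate to one; integrate each piece separately.
total_mass = integral(fX, 0, 1) + integral(fX, 1, 2);
disp(total_mass)
```

Because the masks are applied elementwise, fX accepts a whole vector of t values at once, which is the same device used in the density expressions entered at the tuappr prompt.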
8.2.3 Discrete approximation in the continuous case

For a pair {X, Y} with joint density $f_{XY}$, we approximate the distribution in a manner similar to that for a
single random variable. We then utilize the techniques developed for a pair of simple random variables. If we
have n approximating values $t_i$ for X and m approximating values $u_j$ for Y, we then have n · m pairs $(t_i, u_j)$,
corresponding to points on the plane. If we subdivide the horizontal axis for values of X, with constant
increments dx, as in the single-variable case, and the vertical axis for values of Y, with constant increments
dy, we have a grid structure consisting of rectangles of size dx · dy. We select each $t_i$ and $u_j$ at the midpoint of
its increment, so that the point $(t_i, u_j)$ is at the midpoint of the rectangle. If we let the approximating pair
be {X*, Y*}, we assign

$p_{ij} = P((X^*, Y^*) = (t_i, u_j)) = P(X^* = t_i, Y^* = u_j) = P((X, Y) \text{ in } ij\text{th rectangle})$    (8.29)

As in the one-variable case, if the increments are small enough,

$P((X, Y) \in ij\text{th rectangle}) \approx dx \cdot dy \cdot f_{XY}(t_i, u_j)$    (8.30)

The m-procedure tuappr calls for endpoints of intervals which include the ranges of X and Y and for the
numbers of subintervals on each. It then prompts for an expression for $f_{XY}(t, u)$, from which it determines
the joint probability distribution. It calculates the marginal approximate distributions and sets up the
calculating matrices t and u as does the m-procedure jcalc for simple random variables. Calculations are then
carried out as for any joint simple pair.

Example 8.12: Approximation to a joint continuous distribution

$f_{XY}(t, u) = 3$ on $0 \le u \le t^2 \le 1$    (8.31)

Determine $P(X \le 0.8, Y > 0.1)$.

>> tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  3*(u <= t.^2)
Use array operations on X, Y, PX, PY, t, u, and P
>> M = (t <= 0.8)&(u > 0.1);
>> p = total(M.*P)          % Evaluation of the integral with
p =   0.3355                % Maple gives 0.3352455531
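
For readers without the m-procedures at hand, the midpoint-grid approximation of (8.29) and (8.30) can be sketched in base MATLAB. This is not the author's tuappr. It is a minimal stand-in under stated assumptions: the grid is built with meshgrid (so the row orientation differs from the "as on the plane" layout, which does not affect sums), the cell probabilities are renormalized to total one, and all variable names are illustrative.

```matlab
% Midpoint-grid approximation for f_XY(t,u) = 3 on 0 <= u <= t^2 <= 1.
n = 200; m = 200;                 % numbers of approximation points
dx = 1/n; dy = 1/m;               % increments on [0,1] for X and Y
X = ((1:n) - 0.5)*dx;             % midpoints of the X subintervals
Y = ((1:m) - 0.5)*dy;             % midpoints of the Y subintervals
[t, u] = meshgrid(X, Y);          % t varies along rows, u down columns
P = dx*dy*3*(u <= t.^2);          % approximate cell probabilities, as in (8.30)
P = P/sum(P(:));                  % renormalize (an assumption of this sketch)
M = (t <= 0.8) & (u > 0.1);       % event X <= 0.8, Y > 0.1
p = sum(sum(M.*P));               % plays the role of total(M.*P)
disp(p)
```

The result should lie close to the exact value $[t^3 - 0.3t]_{\sqrt{0.1}}^{0.8} \approx 0.33525$, which agrees with the Maple figure quoted in the transcript.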

The discrete approximation may be used to obtain approximate plots of marginal distribution and density
functions.

Example 8.13: Approximate plots of marginal density and distribution functions

$f_{XY}(t, u) = 3u$ on the triangle bounded by u = 0, $u \le 1 + t$, and $u \le 1 - t$.

>> tuappr
Enter matrix [a b] of X-range endpoints  [-1 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  400
Enter number of Y approximation points  200
Enter expression for joint density  3*u.*(u<=min(1+t,1-t))
Use array operations on X, Y, PX, PY, t, u, and P
>> fx = PX/dx;              % Density for X  (see Figure 8.10)
                            % Theoretical (3/2)(1 - |t|)^2
>> fy = PY/dy;              % Density for Y
>> FX = cumsum(PX);         % Distribution function for X (Figure 8.10)
>> FY = cumsum(PY);         % Distribution function for Y
>> plot(X,fx,X,FX)          % Plotting details omitted

Figure 8.10: Marginal density and distribution function for Example 8.13 (Approximate plots of
marginal density and distribution functions).

These approximation techniques are useful in dealing with functions of random variables, expectations, and
conditional expectation and regression.

Exercise 8.1 (Solution on p. 215.)

Two cards are selected at random, without replacement, from a standard deck. Let X be the
number of aces and Y be the number of spades. Under the usual assumptions, determine the joint
distribution and the marginals.

Exercise 8.2 (Solution on p. 215.)

Two positions for campus jobs are open. Two sophomores, three juniors, and three seniors apply.
It is decided to select two at random (each possible pair equally likely). Let X be the number of
sophomores and Y be the number of juniors who are selected. Determine the joint distribution for
the pair {X, Y} and from this determine the marginals for each.

Exercise 8.3 (Solution on p. 216.)

A die is rolled. Let X be the number that turns up. A coin is flipped X times. Let Y be the
number of heads that turn up. Determine the joint distribution for the pair {X, Y}. Assume
$P(X = k) = 1/6$ for $1 \le k \le 6$ and, for each k, $P(Y = j | X = k)$ has the binomial (k, 1/2)
distribution. Arrange the joint matrix as on the plane, with values of Y increasing upward. Determine
the marginal distribution for Y. (For a MATLAB-based way to determine the joint distribution see
Example 7 (Example 14.7: A random number N of Bernoulli trials) from "Conditional Expectation,
Regression".)

Exercise 8.4 (Solution on p. 216.) 

As a variation of Exercise 8.3, suppose a pair of dice is rolled instead of a single die. Determine
the joint distribution for the pair {X, Y} and from this determine the marginal distribution for Y.

Exercise 8.5 (Solution on p. 217.)

Suppose a pair of dice is rolled. Let X be the total number of spots which turn up. Roll the pair
an additional X times. Let Y be the number of sevens that are thrown on the X rolls. Determine
the joint distribution for the pair {X, Y} and from this determine the marginal distribution for Y.
What is the probability of three or more sevens?

Exercise 8.6 (Solution on p. 218.)

The pair {X, Y} has the joint distribution (in m-file npr08_06.m (Section 17.8.37: npr08_06)):

X = [-2.3 -0.7 1.1 3.9 5.1]    Y = [1.3 2.5 4.1 5.3]    (8.32)

P =
    0.0483    0.0357    0.0420    0.0399    0.0441
    0.0437    0.0323    0.0380    0.0361    0.0399
    0.0713    0.0527    0.0620    0.0609    0.0551
    0.0667    0.0493    0.0580    0.0651    0.0589    (8.33)

Determine the marginal distributions and the corner values for $F_{XY}$. Determine $P(X + Y > 2)$
and $P(X \ge Y)$.

Exercise 8.7 (Solution on p. 218.)

The pair {X, Y} has the joint distribution (in m-file npr08_07.m (Section 17.8.38: npr08_07)):

P(X = t, Y = u)    (8.34)

t =         -3.1      -0.5       1.2       2.4       3.7       4.9
u =  7.5    0.0090    0.0396    0.0594    0.0216    0.0440    0.0203
     4.1    0.0495    0         0.1089    0.0528    0.0363    0.0231
    -2.0    0.0405    0.1320    0.0891    0.0324    0.0297    0.0189
    -3.8    0.0510    0.0484    0.0726    0.0132    0         0.0077

Table 8.1

Determine the marginal distributions and the corner values for $F_{XY}$. Determine
$P(1 \le X \le 4, Y > 4)$ and $P(|X - Y| \le 2)$.

3 This content is available online at <http://cnx.Org/content/m24244/l.4/>.

Exercise 8.8 (Solution on p. 219.)

The pair {X, Y} has the joint distribution (in m-file npr08_08.m (Section 17.8.39: npr08_08)):

P(X = t, Y = u)    (8.35)

t =        1        3        5        7        9        11       13       15       17       19
u = 12     0.0156   0.0191   0.0081   0.0035   0.0091   0.0070   0.0098   0.0056   0.0091   0.0049
    10     0.0064   0.0204   0.0108   0.0040   0.0054   0.0080   0.0112   0.0064   0.0104   0.0056
     9     0.0196   0.0256   0.0126   0.0060   0.0156   0.0120   0.0168   0.0096   0.0056   0.0084
     5     0.0112   0.0182   0.0108   0.0070   0.0182   0.0140   0.0196   0.0012   0.0182   0.0038
     3     0.0060   0.0260   0.0162   0.0050   0.0160   0.0200   0.0280   0.0060   0.0160   0.0040
    -1     0.0096   0.0056   0.0072   0.0060   0.0256   0.0120   0.0268   0.0096   0.0256   0.0084
    -3     0.0044   0.0134   0.0180   0.0140   0.0234   0.0180   0.0252   0.0244   0.0234   0.0126
    -5     0.0072   0.0017   0.0063   0.0045   0.0167   0.0090   0.0026   0.0172   0.0217   0.0223

Table 8.2

Determine the marginal distributions. Determine $F_{XY}(10, 6)$ and $P(X > Y)$.

Exercise 8.9 (Solution on p. 220.) 

Data were kept on the effect of training time on the time to perform a job on a production line. X
is the amount of training, in hours, and Y is the time to perform the task, in minutes. The data
are as follows (in m-file npr08_09.m (Section 17.8.40: npr08_09)):

P(X = t, Y = u)    (8.36)

t =       1        1.5      2        2.5      3
u = 5     0.039    0.011    0.005    0.001    0.001
    4     0.065    0.070    0.050    0.015    0.010
    3     0.031    0.061    0.137    0.051    0.033
    2     0.012    0.049    0.163    0.058    0.039
    1     0.003    0.009    0.045    0.025    0.017

Table 8.3

Determine the marginal distributions. Determine $F_{XY}(2, 3)$ and $P(Y/X \ge 1.25)$.
For the joint densities in Exercises 10-22 below

a. Sketch the region of definition and determine analytically the marginal density functions $f_X$ and $f_Y$.

b. Use a discrete approximation to plot the marginal density $f_X$ and the marginal distribution function $F_X$.

c. Calculate analytically the indicated probabilities.

d. Determine by discrete approximation the indicated probabilities.

Exercise 8.10 (Solution on p. 220.)

$f_{XY}(t, u) = 1$ for $0 \le t \le 1$, $0 \le u \le 2(1 - t)$.

$P(X > 1/2, Y > 1)$, $P(0 \le X \le 1/2, Y > 1/2)$, $P(Y \le X)$    (8.37)

Exercise 8.11 (Solution on p. 221.)

$f_{XY}(t, u) = 1/2$ on the square with vertices at (1, 0), (2, 1), (1, 2), (0, 1).

$P(X > 1, Y > 1)$, $P(X \le 1/2, 1 < Y)$, $P(Y \le X)$    (8.38)

Exercise 8.12 (Solution on p. 221.)

$f_{XY}(t, u) = 4t(1 - u)$ for $0 \le t \le 1$, $0 \le u \le 1$.

$P(1/2 < X < 3/4, Y > 1/2)$, $P(X \le 1/2, Y > 1/2)$, $P(Y \le X)$    (8.39)

Exercise 8.13 (Solution on p. 222.)

$f_{XY}(t, u) = \frac{1}{8}(t + u)$ for $0 \le t \le 2$, $0 \le u \le 2$.

$P(X > 1/2, Y > 1/2)$, $P(0 \le X \le 1, Y > 1)$, $P(Y \le X)$    (8.40)

Exercise 8.14 (Solution on p. 223.)

$f_{XY}(t, u) = 4ue^{-2t}$ for $0 \le t$, $0 \le u \le 1$.

$P(X \le 1, Y > 1)$, $P(X > 0.5, 1/2 < Y < 3/4)$, $P(X < Y)$    (8.41)

Exercise 8.15 (Solution on p. 223.)

$f_{XY}(t, u) = \frac{3}{88}(2t + 3u^2)$ for $0 \le t \le 2$, $0 \le u \le 1 + t$.

$F_{XY}(1, 1)$, $P(X \le 1, Y > 1)$, $P(|X - Y| < 1)$    (8.42)

Exercise 8.16 (Solution on p. 224.)

$f_{XY}(t, u) = 12t^2 u$ on the parallelogram with vertices (-1, 0), (0, 0), (1, 1), (0, 1).

$P(X \le 1/2, Y > 0)$, $P(X < 1/2, Y \le 1/2)$, $P(Y \ge 1/2)$    (8.43)

Exercise 8.17 (Solution on p. 224.)

$f_{XY}(t, u) = \frac{24}{11} tu$ for $0 \le t \le 2$, $0 \le u \le \min\{1, 2 - t\}$.

$P(X \le 1, Y \le 1)$, $P(X > 1)$, $P(X < Y)$    (8.44)

Exercise 8.18 (Solution on p. 225.)

$f_{XY}(t, u) = \frac{3}{23}(t + 2u)$ for $0 \le t \le 2$, $0 \le u \le \max\{2 - t, t\}$.

$P(X \ge 1, Y \ge 1)$, $P(Y \le 1)$, $P(Y \le X)$    (8.45)

Exercise 8.19 (Solution on p. 226.)

$f_{XY}(t, u) = \frac{12}{179}(3t^2 + u)$ for $0 \le t \le 2$, $0 \le u \le \min\{2, 3 - t\}$.

$P(X \ge 1, Y \ge 1)$, $P(X \le 1, Y \le 1)$, $P(Y < X)$    (8.46)

Exercise 8.20 (Solution on p. 227.)

$f_{XY}(t, u) = \frac{12}{227}(3t + 2tu)$ for $0 \le t \le 2$, $0 \le u \le \min\{1 + t, 2\}$.

$P(X \le 1/2, Y \le 3/2)$, $P(X \le 1.5, Y > 1)$, $P(Y < X)$    (8.47)

Exercise 8.21 (Solution on p. 228.)

$f_{XY}(t, u) = \frac{2}{13}(t + 2u)$ for $0 \le t \le 2$, $0 \le u \le \min\{2t, 3 - t\}$.

$P(X < 1)$, $P(X \ge 1, Y \le 1)$, $P(Y \le X/2)$    (8.48)

Exercise 8.22 (Solution on p. 228.)

$f_{XY}(t, u) = I_{[0,1]}(t) \frac{3}{8}(t^2 + 2u) + I_{(1,2]}(t) \frac{9}{14} t^2 u^2$ for $0 \le u \le 1$.

$P(1/2 \le X \le 3/2, Y \le 1/2)$    (8.49)

Solutions to Exercises in Chapter 8



Solution to Exercise 8.1 (p. 211)

Let X be the number of aces and Y be the number of spades. Define the events $AS_i$, $A_i$, $S_i$, and $N_i$,
i = 1, 2, of drawing ace of spades, other ace, spade (other than the ace), and neither on the ith selection. Let
P(i, k) = P(X = i, Y = k).

$P(0,0) = P(N_1 N_2) = \frac{36}{52} \cdot \frac{35}{51} = \frac{1260}{2652}$
$P(0,1) = P(N_1 S_2 \vee S_1 N_2) = \frac{36}{52} \cdot \frac{12}{51} + \frac{12}{52} \cdot \frac{36}{51} = \frac{864}{2652}$
$P(0,2) = P(S_1 S_2) = \frac{12}{52} \cdot \frac{11}{51} = \frac{132}{2652}$
$P(1,0) = P(A_1 N_2 \vee N_1 A_2) = \frac{3}{52} \cdot \frac{36}{51} + \frac{36}{52} \cdot \frac{3}{51} = \frac{216}{2652}$
$P(1,1) = P(A_1 S_2 \vee S_1 A_2 \vee AS_1 N_2 \vee N_1 AS_2) = \frac{3}{52} \cdot \frac{12}{51} + \frac{12}{52} \cdot \frac{3}{51} + \frac{1}{52} \cdot \frac{36}{51} + \frac{36}{52} \cdot \frac{1}{51} = \frac{144}{2652}$
$P(1,2) = P(AS_1 S_2 \vee S_1 AS_2) = \frac{1}{52} \cdot \frac{12}{51} + \frac{12}{52} \cdot \frac{1}{51} = \frac{24}{2652}$
$P(2,0) = P(A_1 A_2) = \frac{3}{52} \cdot \frac{2}{51} = \frac{6}{2652}$
$P(2,1) = P(AS_1 A_2 \vee A_1 AS_2) = \frac{1}{52} \cdot \frac{3}{51} + \frac{3}{52} \cdot \frac{1}{51} = \frac{6}{2652}$
$P(2,2) = P(\emptyset) = 0$

% type npr08_01 (Section 17.8.32: npr08_01)
% file npr08_01.m (Section 17.8.32: npr08_01)
% Solution for Exercise 8.1
X = 0:2;
Y = 0:2;
Pn = [132 24 0; 864 144 6; 1260 216 6];
P = Pn/(52*51);
disp('Data in Pn, P, X, Y')

npr08_01                    % Call for mfile
Data in Pn, P, X, Y         % Result
PX = sum(P)
PX =  0.8507    0.1448    0.0045
PY = fliplr(sum(P'))
PY =  0.5588    0.3824    0.0588

Solution to Exercise 8.2 (p. 211)

Let $A_i$, $B_i$, $C_i$ be the events of selecting a sophomore, junior, or senior, respectively, on the ith trial. Let X
be the number of sophomores and Y be the number of juniors selected.
Set P(i, k) = P(X = i, Y = k).

$P(0,0) = P(C_1 C_2) = \frac{3}{8} \cdot \frac{2}{7} = \frac{6}{56}$
$P(0,1) = P(B_1 C_2) + P(C_1 B_2) = \frac{3}{8} \cdot \frac{3}{7} + \frac{3}{8} \cdot \frac{3}{7} = \frac{18}{56}$
$P(0,2) = P(B_1 B_2) = \frac{3}{8} \cdot \frac{2}{7} = \frac{6}{56}$
$P(1,0) = P(A_1 C_2) + P(C_1 A_2) = \frac{2}{8} \cdot \frac{3}{7} + \frac{3}{8} \cdot \frac{2}{7} = \frac{12}{56}$
$P(1,1) = P(A_1 B_2) + P(B_1 A_2) = \frac{2}{8} \cdot \frac{3}{7} + \frac{3}{8} \cdot \frac{2}{7} = \frac{12}{56}$
$P(2,0) = P(A_1 A_2) = \frac{2}{8} \cdot \frac{1}{7} = \frac{2}{56}$
$P(1,2) = P(2,1) = P(2,2) = 0$

PX = [30/56 24/56 2/56]    PY = [20/56 30/56 6/56]

% file npr08_02.m (Section 17.8.33: npr08_02)
% Solution for Exercise 8.2
X = 0:2;
Y = 0:2;
Pn = [6 0 0; 18 12 0; 6 12 2];
P = Pn/56;
disp('Data are in X, Y, Pn, P')

npr08_02 (Section 17.8.33: npr08_02)
Data are in X, Y, Pn, P
PX = sum(P)
PX =  0.5357    0.4286    0.0357
PY = fliplr(sum(P'))
PY =  0.3571    0.5357    0.1071

Solution to Exercise 8.3 (p. 211)

$P(X = i, Y = k) = P(X = i) P(Y = k | X = i) = (1/6) P(Y = k | X = i)$.

% file npr08_03.m (Section 17.8.34: npr08_03)
% Solution for Exercise 8.3
X = 1:6;
Y = 0:6;
P0 = zeros(6,7);            % Initialize
for i = 1:6                 % Calculate rows of Y probabilities
  P0(i,1:i+1) = (1/6)*ibinom(i,1/2,0:i);
end
P = rot90(P0);              % Rotate to orient as on the plane
PY = fliplr(sum(P'));       % Reverse to put in normal order
disp('Answers are in X, Y, P, PY')

npr08_03 (Section 17.8.34: npr08_03)   % Call for solution m-file
Answers are in X, Y, P, PY
disp(P)
         0         0         0         0         0    0.0026
         0         0         0         0    0.0052    0.0156
         0         0         0    0.0104    0.0260    0.0391
         0         0    0.0208    0.0417    0.0521    0.0521
         0    0.0417    0.0625    0.0625    0.0521    0.0391
    0.0833    0.0833    0.0625    0.0417    0.0260    0.0156
    0.0833    0.0417    0.0208    0.0104    0.0052    0.0026
disp(PY)
    0.1641    0.3125    0.2578    0.1667    0.0755    0.0208    0.0026
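
The m-file above depends on the author's m-function ibinom. As a cross-check that needs nothing beyond base MATLAB, the sketch below builds the same joint matrix from the binomial formula with nchoosek. It is an alternative written for illustration, not the author's code, and the loop variables k and j are our own names.

```matlab
% Joint distribution for one die (X) and the number of heads in X flips (Y),
% built directly from P(X = k) = 1/6 and the binomial (k, 1/2) formula.
X = 1:6;
Y = 0:6;
P0 = zeros(6,7);                          % rows indexed by X, columns by Y
for k = 1:6
  for j = 0:k
    P0(k,j+1) = (1/6)*nchoosek(k,j)*(1/2)^k;
  end
end
P  = rot90(P0);                           % orient as on the plane (Y increasing upward)
PY = fliplr(sum(P,2)');                   % marginal for Y, in the order Y = 0, ..., 6
disp(PY)
```

The first entry of PY is $P(Y = 0) = \frac{1}{6}\sum_{k=1}^{6} 2^{-k} = 63/384 \approx 0.1641$, in agreement with the displayed result.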



Solution to Exercise 8.4 (p. 211)

% file npr08_04.m (Section 17.8.35: npr08_04)
% Solution for Exercise 8.4
X = 2:12;
Y = 0:12;
PX = (1/36)*[1 2 3 4 5 6 5 4 3 2 1];
P0 = zeros(11,13);
for i = 1:11
  P0(i,1:i+2) = PX(i)*ibinom(i+1,1/2,0:i+1);
end
P = rot90(P0);
PY = fliplr(sum(P'));
disp('Answers are in X, Y, PY, P')

npr08_04 (Section 17.8.35: npr08_04)
Answers are in X, Y, PY, P
disp(P)
  Columns 1 through 7
         0         0         0         0         0         0         0
         0         0         0         0         0         0         0
         0         0         0         0         0         0         0
         0         0         0         0         0         0         0
         0         0         0         0         0         0    0.0005
         0         0         0         0         0    0.0013    0.0043
         0         0         0         0    0.0022    0.0091    0.0152
         0         0         0    0.0035    0.0130    0.0273    0.0304
         0         0    0.0052    0.0174    0.0326    0.0456    0.0380
         0    0.0069    0.0208    0.0347    0.0434    0.0456    0.0304
    0.0069    0.0208    0.0312    0.0347    0.0326    0.0273    0.0152
    0.0139    0.0208    0.0208    0.0174    0.0130    0.0091    0.0043
    0.0069    0.0069    0.0052    0.0035    0.0022    0.0013    0.0005
  Columns 8 through 11
         0         0         0    0.0000
         0         0    0.0000    0.0001
         0    0.0001    0.0003    0.0004
    0.0002    0.0008    0.0015    0.0015
    0.0020    0.0037    0.0045    0.0034
    0.0078    0.0098    0.0090    0.0054
    0.0182    0.0171    0.0125    0.0063
    0.0273    0.0205    0.0125    0.0054
    0.0273    0.0171    0.0090    0.0034
    0.0182    0.0098    0.0045    0.0015
    0.0078    0.0037    0.0015    0.0004
    0.0020    0.0008    0.0003    0.0001
    0.0002    0.0001    0.0000    0.0000
disp(PY)
  Columns 1 through 7
    0.0269    0.1025    0.1823    0.2158    0.1954    0.1400    0.0806
  Columns 8 through 13
    0.0375    0.0140    0.0040    0.0008    0.0001    0.0000







Solution to Exercise 8.5 (p. 211)

% file npr08_05.m (Section 17.8.36: npr08_05)
% Data and basic calculations for Exercise 8.5
PX = (1/36)*[1 2 3 4 5 6 5 4 3 2 1];
X = 2:12;
Y = 0:12;
P0 = zeros(11,13);
for i = 1:11
  P0(i,1:i+2) = PX(i)*ibinom(i+1,1/6,0:i+1);
end
P = rot90(P0);
PY = fliplr(sum(P'));
disp('Answers are in X, Y, P, PY')

npr08_05 (Section 17.8.36: npr08_05)
Answers are in X, Y, P, PY
disp(PY)
  Columns 1 through 7
    0.3072    0.3660    0.2152    0.0828    0.0230    0.0048    0.0008
  Columns 8 through 13
    0.0001    0.0000    0.0000    0.0000    0.0000    0.0000

Solution to Exercise 8.6 (p. 211)

npr08_06 (Section 17.8.37: npr08_06)
Data are in X, Y, P
jcalc
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
 Use array operations on matrices X, Y, PX, PY, t, u, and P
disp([X;PX]')
   -2.3000    0.2300
   -0.7000    0.1700
    1.1000    0.2000
    3.9000    0.2020
    5.1000    0.1980
disp([Y;PY]')
    1.3000    0.2980
    2.5000    0.3020
    4.1000    0.1900
    5.3000    0.2100
jddbn
Enter joint probability matrix (as on the plane)  P
To view joint distribution function, call for FXY
disp(FXY)
    0.2300    0.4000    0.6000    0.8020    1.0000
    0.1817    0.3160    0.4740    0.6361    0.7900
    0.1380    0.2400    0.3600    0.4860    0.6000
    0.0667    0.1160    0.1740    0.2391    0.2980
P1 = total((t+u>2).*P)
P1 =  0.7163
P2 = total((t>=u).*P)
P2 =  0.2799

Solution to Exercise 8.7 (p. 211)

npr08_07 (Section 17.8.38: npr08_07)
Data are in X, Y, P
jcalc
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
 Use array operations on matrices X, Y, PX, PY, t, u, and P
disp([X;PX]')
   -3.1000    0.1500
   -0.5000    0.2200
    1.2000    0.3300
    2.4000    0.1200
    3.7000    0.1100
    4.9000    0.0700
disp([Y;PY]')
   -3.8000    0.1929
   -2.0000    0.3426
    4.1000    0.2706
    7.5000    0.1939
jddbn
Enter joint probability matrix (as on the plane)  P
To view joint distribution function, call for FXY
disp(FXY)
    0.1500    0.3700    0.7000    0.8200    0.9300    1.0000
    0.1410    0.3214    0.5920    0.6904    0.7564    0.8061
    0.0915    0.2719    0.4336    0.4792    0.5089    0.5355
    0.0510    0.0994    0.1720    0.1852    0.1852    0.1929
M = (1<=t)&(t<=4)&(u>4);
P1 = total(M.*P)
P1 =  0.3230
P2 = total((abs(t-u)<=2).*P)
P2 =  0.3357

Solution to Exercise 8.8 (p. 212)

npr08_08 (Section 17.8.39: npr08_08)
Data are in X, Y, P
jcalc
 Use array operations on matrices X, Y, PX, PY, t, u, and P
disp([X;PX]')
    1.0000    0.0800
    3.0000    0.1300
    5.0000    0.0900
    7.0000    0.0500
    9.0000    0.1300
   11.0000    0.1000
   13.0000    0.1400
   15.0000    0.0800
   17.0000    0.1300
   19.0000    0.0700
disp([Y;PY]')
   -5.0000    0.1092
   -3.0000    0.1768
   -1.0000    0.1364
    3.0000    0.1432
    5.0000    0.1222
    9.0000    0.1318
   10.0000    0.0886
   12.0000    0.0918
F = total(((t<=10)&(u<=6)).*P)
F =   0.2982
P = total((t>u).*P)
P =   0.7390

Solution to Exercise 8.9 (p. 212)

npr08_09 (Section 17.8.40: npr08_09)
Data are in X, Y, P
jcalc
 Use array operations on matrices X, Y, PX, PY, t, u, and P
disp([X;PX]')
    1.0000    0.1500
    1.5000    0.2000
    2.0000    0.4000
    2.5000    0.1500
    3.0000    0.1000
disp([Y;PY]')
    1.0000    0.0990
    2.0000    0.3210
    3.0000    0.3130
    4.0000    0.2100
    5.0000    0.0570
F = total(((t<=2)&(u<=3)).*P)
F =   0.5100
P = total((u./t>=1.25).*P)
P =   0.5570

Solution to Exercise 8.10 (p. 213)

Region is triangle with vertices (0,0), (1,0), (0,2).

$f_X(t) = \int_0^{2(1-t)} du = 2(1 - t), \quad 0 \le t \le 1$    (8.50)

$f_Y(u) = \int_0^{1 - u/2} dt = 1 - u/2, \quad 0 \le u \le 2$    (8.51)

$M1 = \{(t, u) : t > 1/2, u > 1\}$ lies outside the triangle, so $P((X, Y) \in M1) = 0$    (8.52)
$M2 = \{(t, u) : 0 \le t \le 1/2, u > 1/2\}$ has area in the triangle = 1/2    (8.53)
M3 = the region in the triangle under u = t, which has area 1/3    (8.54)

tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  200
Enter number of Y approximation points  400
Enter expression for joint density  (t<=1)&(u<=2*(1-t))
Use array operations on X, Y, PX, PY, t, u, and P
fx = PX/dx;
FX = cumsum(PX);
plot(X,fx,X,FX)             % Figure not reproduced
M1 = (t>0.5)&(u>1);
P1 = total(M1.*P)
P1 =  0                     % Theoretical = 0
M2 = (t<=0.5)&(u>0.5);
P2 = total(M2.*P)
P2 =  0.5000                % Theoretical = 1/2
P3 = total((u<=t).*P)
P3 =  0.3350                % Theoretical = 1/3

Solution to Exercise 8.11 (p. 213)

The region is bounded by the lines u = 1 + t, u = 1 - t, u = 3 - t, and u = t - 1.

$f_X(t) = I_{[0,1]}(t)\, 0.5 \int_{1-t}^{1+t} du + I_{(1,2]}(t)\, 0.5 \int_{t-1}^{3-t} du = I_{[0,1]}(t)\, t + I_{(1,2]}(t)\, (2 - t) = f_Y(t)$ by symmetry    (8.55)

$M1 = \{(t, u) : t > 1, u > 1\}$ has area in the square = 1/2, so PM1 = 1/4    (8.56)

$M2 = \{(t, u) : t \le 1/2, u > 1\}$ has area in the square = 1/8, so PM2 = 1/16    (8.57)

$M3 = \{(t, u) : u \le t\}$ has area in the square = 1, so PM3 = 1/2    (8.58)

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  .5*(u<=min(1+t,3-t))& ...
                                     (u>=max(1-t,t-1))
Use array operations on X, Y, PX, PY, t, u, and P
fx = PX/dx;
FX = cumsum(PX);
plot(X,fx,X,FX)             % Plot not shown
M1 = (t>1)&(u>1);
PM1 = total(M1.*P)
PM1 =  0.2501               % Theoretical = 1/4
M2 = (t<=1/2)&(u>1);
PM2 = total(M2.*P)
PM2 =  0.0631               % Theoretical = 1/16 = 0.0625
M3 = u<=t;
PM3 = total(M3.*P)
PM3 =  0.5023               % Theoretical = 1/2

Solution to Exercise 8.12 (p. 213)

Region is the unit square.

$f_X(t) = \int_0^1 4t(1 - u)\, du = 2t, \quad 0 \le t \le 1$    (8.59)

$f_Y(u) = \int_0^1 4t(1 - u)\, dt = 2(1 - u), \quad 0 \le u \le 1$    (8.60)

$P1 = \int_{1/2}^{3/4} \int_{1/2}^{1} 4t(1 - u)\, du\, dt = 5/64$,  $P2 = \int_0^{1/2} \int_{1/2}^{1} 4t(1 - u)\, du\, dt = 1/16$    (8.61)

$P3 = \int_0^1 \int_0^t 4t(1 - u)\, du\, dt = 5/6$    (8.62)

tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  4*t.*(1 - u)
Use array operations on X, Y, PX, PY, t, u, and P
fx = PX/dx;
FX = cumsum(PX);
plot(X,fx,X,FX)             % Plot not shown
M1 = (1/2<t)&(t<3/4)&(u>1/2);
P1 = total(M1.*P)
P1 =  0.0781                % Theoretical = 5/64 = 0.0781
M2 = (t<=1/2)&(u>1/2);
P2 = total(M2.*P)
P2 =  0.0625                % Theoretical = 1/16 = 0.0625
M3 = (u<=t);
P3 = total(M3.*P)
P3 =  0.8350                % Theoretical = 5/6 = 0.8333

Solution to Exercise 8.13 (p. 213)

Region is the square $0 \le t \le 2$, $0 \le u \le 2$.

$f_X(t) = \frac{1}{8} \int_0^2 (t + u)\, du = \frac{1}{4}(t + 1) = f_Y(t), \quad 0 \le t \le 2$    (8.63)

$P1 = \frac{1}{8} \int_{1/2}^{2} \int_{1/2}^{2} (t + u)\, du\, dt = 45/64$,  $P2 = \frac{1}{8} \int_0^1 \int_1^2 (t + u)\, du\, dt = 1/4$    (8.64)

$P3 = \frac{1}{8} \int_0^2 \int_0^t (t + u)\, du\, dt = 1/2$    (8.65)

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  (1/8)*(t+u)
Use array operations on X, Y, PX, PY, t, u, and P
fx = PX/dx;
FX = cumsum(PX);
plot(X,fx,X,FX)
M1 = (t>1/2)&(u>1/2);
P1 = total(M1.*P)
P1 =  0.7031                % Theoretical = 45/64 = 0.7031
M2 = (t<=1)&(u>1);
P2 = total(M2.*P)
P2 =  0.2500                % Theoretical = 1/4
M3 = u<=t;
P3 = total(M3.*P)
P3 =  0.5025                % Theoretical = 1/2

Solution to Exercise 8.14 (p. 213)

Region is strip bounded by t = 0, u = 0, u = 1.

$f_X(t) = 2e^{-2t}, \ 0 \le t$,  $f_Y(u) = 2u, \ 0 \le u \le 1$,  $f_{XY} = f_X f_Y$    (8.66)

$P1 = 0$,  $P2 = \int_{0.5}^{\infty} 2e^{-2t}\, dt \int_{1/2}^{3/4} 2u\, du = e^{-1}\, 5/16$    (8.67)

$P3 = 4 \int_0^1 \int_t^1 u e^{-2t}\, du\, dt = \frac{3}{2} e^{-2} + \frac{1}{2} = 0.7030$    (8.68)

tuappr
Enter matrix [a b] of X-range endpoints  [0 3]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  400
Enter number of Y approximation points  200
Enter expression for joint density  4*u.*exp(-2*t)
Use array operations on X, Y, PX, PY, t, u, and P
M2 = (t > 0.5)&(u > 0.5)&(u<3/4);
p2 = total(M2.*P)
p2 =  0.1139                % Theoretical = (5/16)exp(-1) = 0.1150
p3 = total((t<u).*P)
p3 =  0.7047                % Theoretical = 0.7030
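
The analytic value of P3 can also be confirmed by direct numerical integration over the region $t < u \le 1$. This is a sketch using base MATLAB's integral2, which is an assumption here since the text uses either hand calculation or the discrete approximation. The names f and p3_num are illustrative.

```matlab
% P(X < Y) for f_XY(t,u) = 4*u*exp(-2*t), 0 <= t, 0 <= u <= 1:
% integrate over 0 <= t <= 1 with u running from t up to 1.
f      = @(t,u) 4*u.*exp(-2*t);
p3_num = integral2(f, 0, 1, @(t) t, 1);
fprintf('numerical = %.4f, closed form = %.4f\n', p3_num, 1.5*exp(-2) + 0.5)
```

The gap between this value and the tuappr result reflects the grid approximation and the truncation of the X range to [0, 3].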

Solution to Exercise 8.15 (p. 213)

Region bounded by t = 0, t = 2, u = 0, u = 1 + t.

$f_X(t) = \frac{3}{88} \int_0^{1+t} (2t + 3u^2)\, du = \frac{3}{88}(1 + t)(1 + 4t + t^2) = \frac{3}{88}(1 + 5t + 5t^2 + t^3), \quad 0 \le t \le 2$    (8.69)

$f_Y(u) = I_{[0,1]}(u) \frac{3}{88} \int_0^2 (2t + 3u^2)\, dt + I_{(1,3]}(u) \frac{3}{88} \int_{u-1}^{2} (2t + 3u^2)\, dt =$    (8.70)

$I_{[0,1]}(u) \frac{3}{88}(6u^2 + 4) + I_{(1,3]}(u) \frac{3}{88}(3 + 2u + 8u^2 - 3u^3)$    (8.71)

$F_{XY}(1, 1) = \int_0^1 \int_0^1 f_{XY}(t, u)\, du\, dt = 3/44$    (8.72)

$P1 = \int_0^1 \int_1^{1+t} f_{XY}(t, u)\, du\, dt = 41/352$,  $P2 = \int_0^1 \int_0^{1+t} f_{XY}(t, u)\, du\, dt + \int_1^2 \int_{t-1}^{1+t} f_{XY}(t, u)\, du\, dt = 329/352$    (8.73)

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 3]
Enter number of X approximation points  200
Enter number of Y approximation points  300
Enter expression for joint density  (3/88)*(2*t+3*u.^2).*(u<=1+t)
Use array operations on X, Y, PX, PY, t, u, and P
fx = PX/dx;
FX = cumsum(PX);
plot(X,fx,X,FX)
MF = (t<=1)&(u<=1);
F = total(MF.*P)
F =   0.0681                % Theoretical = 3/44 = 0.0682
M1 = (t<=1)&(u>1);
P1 = total(M1.*P)
P1 =  0.1172                % Theoretical = 41/352 = 0.1165
M2 = abs(t-u)<1;
P2 = total(M2.*P)
P2 =  0.9297                % Theoretical = 329/352 = 0.9347
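
The value of $P(|X - Y| < 1)$ for Exercise 8.15 can be checked by numerical integration. In the sketch below the event is split at t = 1: for $0 \le t \le 1$ the whole range $0 \le u \le 1 + t$ qualifies, and for $1 < t \le 2$ the range is $t - 1 < u \le 1 + t$. The use of base MATLAB's integral2 and the names f and p2_num are assumptions of this sketch.

```matlab
% P(|X - Y| < 1) for f_XY(t,u) = (3/88)(2t + 3u^2), 0 <= t <= 2, 0 <= u <= 1 + t.
f      = @(t,u) (3/88)*(2*t + 3*u.^2);
p2_num = integral2(f, 0, 1, 0,          @(t) 1 + t) + ...
         integral2(f, 1, 2, @(t) t - 1, @(t) 1 + t);
fprintf('numerical = %.4f, 329/352 = %.4f\n', p2_num, 329/352)
```

An equivalent route is the complement: the only excluded part of the region is $u \le t - 1$ for $1 \le t \le 2$, whose probability is 23/352.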

Solution to Exercise 8.16 (p. 213) 

Region bounded by u = 0, u = t, u = 1, u = t + 1 



PI 



I t 2 udu + I (os] {t) 12 t 2 udu = I { _ lfi] {t)6t 2 {t +1) +/(o.i] 

Jt 


(i) 6r (1 - i 


a ) (8.74) 


f Y (u) = 12 / t 2 udt + 12u 3 - 12u 2 + 4u, < u < 1 

Ju-l 




(8.75) 


t'l t'l /"1/2 ru 

- 12 / / t 2 ududt = 33/80, P2 = 12 / / t 2 udtdu = 

J 1/2 J t JO Ju-l 


: 3/16 


(8.76) 


P3 = 1 - P2 = 13/16 




(8.77) 



tuappr
Enter matrix [a b] of X-range endpoints  [-1 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  400
Enter number of Y approximation points  200
Enter expression for joint density  12*u.*t.^2.*((u<=t+1)&(u>=t))
Use array operations on X, Y, PX, PY, t, u, and P
p1 = total((t<=1/2).*P)
p1 = 0.4098         % Theoretical = 33/80 = 0.4125
M2 = (t<1/2)&(u<=1/2);
p2 = total(M2.*P)
p2 = 0.1856         % Theoretical = 3/16 = 0.1875
P3 = total((u>=1/2).*P)
P3 = 0.8144         % Theoretical = 13/16 = 0.8125
Solution to Exercise 8.17 (p. 213)

Region is bounded by t = 0, u = 0, u = 1, u = 2 - t

$$f_X(t) = I_{[0,1]}(t)\,\frac{24}{11}\int_0^1 tu\,du + I_{(1,2]}(t)\,\frac{24}{11}\int_0^{2-t} tu\,du = \tag{8.78}$$

$$I_{[0,1]}(t)\,\frac{12}{11}t + I_{(1,2]}(t)\,\frac{12}{11}t(2-t)^2 \tag{8.79}$$

$$f_Y(u) = \frac{24}{11}\int_0^{2-u} tu\,dt = \frac{12}{11}u(u-2)^2, \quad 0 \le u \le 1 \tag{8.80}$$

$$P1 = \frac{24}{11}\int_0^1\int_0^1 tu\,du\,dt = 6/11, \qquad P2 = \frac{24}{11}\int_1^2\int_0^{2-t} tu\,du\,dt = 5/11 \tag{8.81}$$

$$P3 = \frac{24}{11}\int_0^1\int_t^1 tu\,du\,dt = 3/11 \tag{8.82}$$

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  400
Enter number of Y approximation points  200
Enter expression for joint density  (24/11)*t.*u.*(u<=2-t)
Use array operations on X, Y, PX, PY, t, u, and P
M1 = (t<=1)&(u<=1);
P1 = total(M1.*P)
P1 = 0.5447         % Theoretical = 6/11 = 0.5455
P2 = total((t>1).*P)
P2 = 0.4553         % Theoretical = 5/11 = 0.4545
P3 = total((t<u).*P)
P3 = 0.2705         % Theoretical = 3/11 = 0.2727

Solution to Exercise 8.18 (p. 214)

Region is bounded by t = 0, t = 2, u = 0, u = 2 - t (0 <= t <= 1), u = t (1 < t <= 2)

$$f_X(t) = I_{[0,1]}(t)\,\frac{3}{23}\int_0^{2-t}(t+2u)\,du + I_{(1,2]}(t)\,\frac{3}{23}\int_0^{t}(t+2u)\,du = I_{[0,1]}(t)\,\frac{6}{23}(2-t) + I_{(1,2]}(t)\,\frac{6}{23}t^2 \tag{8.83}$$

$$f_Y(u) = I_{[0,1]}(u)\,\frac{3}{23}\int_0^2 (t+2u)\,dt + I_{(1,2]}(u)\left[\frac{3}{23}\int_0^{2-u}(t+2u)\,dt + \frac{3}{23}\int_u^2 (t+2u)\,dt\right] = \tag{8.84}$$

$$I_{[0,1]}(u)\,\frac{6}{23}(2u+1) + I_{(1,2]}(u)\,\frac{3}{23}(4 + 6u - 4u^2) \tag{8.85}$$

$$P1 = \frac{3}{23}\int_1^2\int_1^t (t+2u)\,du\,dt = 13/46, \qquad P2 = \frac{3}{23}\int_0^2\int_0^1 (t+2u)\,du\,dt = 12/23 \tag{8.86}$$

$$P3 = \frac{3}{23}\int_0^2\int_0^t (t+2u)\,du\,dt = 16/23 \tag{8.87}$$
tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  (3/23)*(t+2*u).*(u<=max(2-t,t))
Use array operations on X, Y, PX, PY, t, u, and P
M1 = (t>=1)&(u>=1);
P1 = total(M1.*P)
P1 = 0.2841         % Theoretical = 13/46 = 0.2826
P2 = total((u<=1).*P)
P2 = 0.5190         % Theoretical = 12/23 = 0.5217
P3 = total((u<=t).*P)
P3 = 0.6959         % Theoretical = 16/23 = 0.6957

Solution to Exercise 8.19 (p. 214)

Region has two parts: (1) 0 <= t <= 1, 0 <= u <= 2; (2) 1 < t <= 2, 0 <= u <= 3 - t

$$f_X(t) = I_{[0,1]}(t)\,\frac{12}{179}\int_0^2 (3t^2+u)\,du + I_{(1,2]}(t)\,\frac{12}{179}\int_0^{3-t}(3t^2+u)\,du = \tag{8.88}$$

$$I_{[0,1]}(t)\,\frac{24}{179}(3t^2+1) + I_{(1,2]}(t)\,\frac{6}{179}(9 - 6t + 19t^2 - 6t^3) \tag{8.89}$$

$$f_Y(u) = I_{[0,1]}(u)\,\frac{12}{179}\int_0^2 (3t^2+u)\,dt + I_{(1,2]}(u)\,\frac{12}{179}\int_0^{3-u}(3t^2+u)\,dt = \tag{8.90}$$

$$I_{[0,1]}(u)\,\frac{24}{179}(4+u) + I_{(1,2]}(u)\,\frac{12}{179}(27 - 24u + 8u^2 - u^3) \tag{8.91}$$

$$P1 = \frac{12}{179}\int_1^2\int_1^{3-t}(3t^2+u)\,du\,dt = 41/179, \qquad P2 = \frac{12}{179}\int_0^1\int_0^1 (3t^2+u)\,du\,dt = 18/179 \tag{8.92}$$

$$P3 = \frac{12}{179}\int_0^{3/2}\int_0^t (3t^2+u)\,du\,dt + \frac{12}{179}\int_{3/2}^2\int_0^{3-t}(3t^2+u)\,du\,dt = 1001/1432 \tag{8.93}$$

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  (12/179)*(3*t.^2+u).* ...
        (u<=min(2,3-t))
Use array operations on X, Y, PX, PY, t, u, and P
fx = PX/dx;
FX = cumsum(PX);
plot(X,fx,X,FX)
M1 = (t>=1)&(u>=1);
P1 = total(M1.*P)
P1 = 0.2312         % Theoretical = 41/179 = 0.2291
M2 = (t<=1)&(u<=1);
P2 = total(M2.*P)
P2 = 0.1003         % Theoretical = 18/179 = 0.1006
M3 = u<=min(t,3-t);
P3 = total(M3.*P)
P3 = 0.7003         % Theoretical = 1001/1432 = 0.6990

Solution to Exercise 8.20 (p. 214)

Region is in two parts:

1. 0 <= t <= 1, 0 <= u <= 1 + t
2. 1 < t <= 2, 0 <= u <= 2

$$f_X(t) = I_{[0,1]}(t)\int_0^{1+t} f_{XY}(t,u)\,du + I_{(1,2]}(t)\int_0^2 f_{XY}(t,u)\,du = \tag{8.94}$$

$$I_{[0,1]}(t)\,\frac{12}{227}(t^3 + 5t^2 + 4t) + I_{(1,2]}(t)\,\frac{120}{227}t \tag{8.95}$$

$$f_Y(u) = I_{[0,1]}(u)\int_0^2 f_{XY}(t,u)\,dt + I_{(1,2]}(u)\int_{u-1}^2 f_{XY}(t,u)\,dt = \tag{8.96}$$

$$I_{[0,1]}(u)\,\frac{24}{227}(2u+3) + I_{(1,2]}(u)\,\frac{6}{227}(2u+3)(3 + 2u - u^2) \tag{8.97}$$

$$= I_{[0,1]}(u)\,\frac{24}{227}(2u+3) + I_{(1,2]}(u)\,\frac{6}{227}(9 + 12u + u^2 - 2u^3) \tag{8.98}$$

$$P1 = \frac{12}{227}\int_0^{1/2}\int_0^{1+t}(3t + 2tu)\,du\,dt = 139/3632 \tag{8.99}$$

$$P2 = \frac{12}{227}\int_0^1\int_1^{1+t}(3t+2tu)\,du\,dt + \frac{12}{227}\int_1^{3/2}\int_1^2 (3t+2tu)\,du\,dt = 68/227 \tag{8.100}$$

$$P3 = \frac{12}{227}\int_0^2\int_0^t (3t+2tu)\,du\,dt = 144/227 \tag{8.101}$$

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  (12/227)*(3*t+2*t.*u).*(u<=min(1+t,2))
Use array operations on X, Y, PX, PY, t, u, and P
M1 = (t<=1/2)&(u<=3/2);
P1 = total(M1.*P)
P1 = 0.0384         % Theoretical = 139/3632 = 0.0383
M2 = (t<=3/2)&(u>1);
P2 = total(M2.*P)
P2 = 0.3001         % Theoretical = 68/227 = 0.2996
M3 = u<t;
P3 = total(M3.*P)
P3 = 0.6308         % Theoretical = 144/227 = 0.6344
Solution to Exercise 8.21 (p. 214)

Region bounded by t = 2, u = 2t (0 <= t <= 1), u = 3 - t (1 <= t <= 2)

$$f_X(t) = I_{[0,1]}(t)\,\frac{2}{13}\int_0^{2t}(t+2u)\,du + I_{(1,2]}(t)\,\frac{2}{13}\int_0^{3-t}(t+2u)\,du = I_{[0,1]}(t)\,\frac{12}{13}t^2 + I_{(1,2]}(t)\,\frac{6}{13}(3-t) \tag{8.102}$$

$$f_Y(u) = I_{[0,1]}(u)\,\frac{2}{13}\int_{u/2}^2 (t+2u)\,dt + I_{(1,2]}(u)\,\frac{2}{13}\int_{u/2}^{3-u}(t+2u)\,dt = \tag{8.103}$$

$$I_{[0,1]}(u)\left(\frac{4}{13} + \frac{8}{13}u - \frac{9}{52}u^2\right) + I_{(1,2]}(u)\left(\frac{9}{13} + \frac{6}{13}u - \frac{21}{52}u^2\right) \tag{8.104}$$

$$P1 = \frac{2}{13}\int_0^1\int_0^{2t}(t+2u)\,du\,dt = 4/13, \qquad P2 = \frac{2}{13}\int_1^2\int_0^1 (t+2u)\,du\,dt = 5/13 \tag{8.105}$$

$$P3 = \frac{2}{13}\int_0^2\int_0^{t/2}(t+2u)\,du\,dt = 4/13 \tag{8.106}$$


tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  400
Enter number of Y approximation points  400
Enter expression for joint density  (2/13)*(t+2*u).*(u<=min(2*t,3-t))
Use array operations on X, Y, PX, PY, t, u, and P
P1 = total((t<1).*P)
P1 = 0.3076         % Theoretical = 4/13 = 0.3077
M2 = (t>=1)&(u<=1);
P2 = total(M2.*P)
P2 = 0.3844         % Theoretical = 5/13 = 0.3846
P3 = total((u<=t/2).*P)
P3 = 0.3076         % Theoretical = 4/13 = 0.3077

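Solution to Exercise 8.22 (p. 214)

Region is rectangle bounded by t = 0, t = 2, u = 0, u = 1

$$f_{XY}(t,u) = I_{[0,1]}(t)\,\frac{3}{8}(t^2+2u) + I_{(1,2]}(t)\,\frac{9}{14}t^2u^2, \quad 0 \le u \le 1 \tag{8.107}$$

$$f_X(t) = I_{[0,1]}(t)\,\frac{3}{8}\int_0^1 (t^2+2u)\,du + I_{(1,2]}(t)\,\frac{9}{14}\int_0^1 t^2u^2\,du = I_{[0,1]}(t)\,\frac{3}{8}(t^2+1) + I_{(1,2]}(t)\,\frac{3}{14}t^2 \tag{8.108}$$

$$f_Y(u) = \frac{3}{8}\int_0^1 (t^2+2u)\,dt + \frac{9}{14}\int_1^2 t^2u^2\,dt = \frac{1}{8} + \frac{3}{4}u + \frac{3}{2}u^2, \quad 0 \le u \le 1 \tag{8.109}$$

$$P1 = \frac{3}{8}\int_{1/2}^1\int_0^{1/2}(t^2+2u)\,du\,dt + \frac{9}{14}\int_1^{3/2}\int_0^{1/2} t^2u^2\,du\,dt = 55/448 \tag{8.110}$$
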

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  400
Enter number of Y approximation points  200
Enter expression for joint density  (3/8)*(t.^2+2*u).*(t<=1) ...
        + (9/14)*(t.^2.*u.^2).*(t > 1)
Use array operations on X, Y, PX, PY, t, u, and P
M = (1/2<=t)&(t<=3/2)&(u<=1/2);
P = total(M.*P)
P = 0.1228          % Theoretical = 55/448 = 0.1228
Chapter 9

Independent Classes of Random Variables



9.1 Independent Classes of Random Variables 1 

9.1.1 Introduction 

The concept of independence for classes of events is developed in terms of a product rule. In this unit, we 
extend the concept to classes of random variables. 

9.1.2 Independent pairs 

Recall that for a random variable X, the inverse image $X^{-1}(M)$ (i.e., the set of all outcomes $\omega \in \Omega$ which are mapped into M by X) is an event for each reasonable subset M on the real line. Similarly, the inverse image $Y^{-1}(N)$ is an event determined by random variable Y for each reasonable set N. We extend the notion of independence to a pair of random variables by requiring independence of the events they determine. More precisely,

Definition

A pair {X, Y} of random variables is (stochastically) independent iff each pair of events $\{X^{-1}(M), Y^{-1}(N)\}$ is independent.

This condition may be stated in terms of the product rule

$$P(X \in M, Y \in N) = P(X \in M)\,P(Y \in N) \quad \text{for all (Borel) sets } M, N \tag{9.1}$$

Independence implies

$$F_{XY}(t,u) = P(X \in (-\infty, t], Y \in (-\infty, u]) = P(X \in (-\infty, t])\,P(Y \in (-\infty, u]) = \tag{9.2}$$

$$F_X(t)\,F_Y(u) \quad \forall\, t, u \tag{9.3}$$

Note that the product rule on the distribution function is equivalent to the condition that the product rule holds for the inverse images of a special class of sets {M, N} of the form $M = (-\infty, t]$ and $N = (-\infty, u]$. An important theorem from measure theory ensures that if the product rule holds for this special class, it holds for the general class of {M, N}. Thus we may assert

The pair {X, Y} is independent iff the following product rule holds

$$F_{XY}(t,u) = F_X(t)\,F_Y(u) \quad \forall\, t, u \tag{9.4}$$



1 This content is available online at <http://cnx.org/content/m23321/1.5/>.
Example 9.1: An independent pair

Suppose $F_{XY}(t,u) = (1 - e^{-\alpha t})(1 - e^{-\beta u})$, $0 \le t$, $0 \le u$. Taking limits shows

$$F_X(t) = \lim_{u \to \infty} F_{XY}(t,u) = 1 - e^{-\alpha t} \quad \text{and} \quad F_Y(u) = \lim_{t \to \infty} F_{XY}(t,u) = 1 - e^{-\beta u} \tag{9.5}$$

so that the product rule $F_{XY}(t,u) = F_X(t)\,F_Y(u)$ holds. The pair {X, Y} is therefore independent.

If there is a joint density function, then the relationship to the joint distribution function makes it clear that the pair is independent iff the product rule holds for the density. That is, the pair is independent iff

$$f_{XY}(t,u) = f_X(t)\,f_Y(u) \quad \forall\, t, u \tag{9.6}$$

Example 9.2: Joint uniform distribution on a rectangle 

Suppose the joint probability mass distribution induced by the pair {X, Y} is uniform on a rectangle with sides $I_1 = [a, b]$ and $I_2 = [c, d]$. Since the area is (b - a)(d - c), the constant value of $f_{XY}$ is 1/((b - a)(d - c)). Simple integration gives

$$f_X(t) = \frac{1}{(b-a)(d-c)}\int_c^d du = \frac{1}{b-a}, \quad a \le t \le b \quad \text{and} \tag{9.7}$$

$$f_Y(u) = \frac{1}{(b-a)(d-c)}\int_a^b dt = \frac{1}{d-c}, \quad c \le u \le d \tag{9.8}$$

Thus it follows that X is uniform on [a, b], Y is uniform on [c, d], and $f_{XY}(t,u) = f_X(t)\,f_Y(u)$ for all t, u, so that the pair {X, Y} is independent. The converse is also true: if the pair is independent with X uniform on [a, b] and Y uniform on [c, d], then the pair has uniform joint distribution on $I_1 \times I_2$.



9.1.3 The joint mass distribution 

It should be apparent that the independence condition puts restrictions on the character of the joint mass 
distribution on the plane. In order to describe this more succinctly, we employ the following terminology. 
Definition 

If M is a subset of the horizontal axis and N is a subset of the vertical axis, then the Cartesian product M x N is the (generalized) rectangle consisting of those points (t, u) on the plane such that $t \in M$ and $u \in N$.

Example 9.3: Rectangle with interval sides

The rectangle in Example 9.2 (Joint uniform distribution on a rectangle) is the Cartesian product $I_1 \times I_2$, consisting of all those points (t, u) such that $a \le t \le b$ and $c \le u \le d$ (i.e., $t \in I_1$ and $u \in I_2$).
[Figure: a vertical strip over M carrying mass P(X in M), a horizontal strip across N carrying mass P(Y in N), and their intersection, the rectangle M x N, carrying mass P(X in M)P(Y in N).]

Figure 9.1: Joint distribution for an independent pair of random variables.



We restate the product rule for independence in terms of Cartesian product sets.

$$P(X \in M, Y \in N) = P((X, Y) \in M \times N) = P(X \in M)\,P(Y \in N) \tag{9.9}$$

Reference to Figure 9.1 illustrates the basic pattern. If M, N are intervals on the horizontal and vertical axes, respectively, then the rectangle M x N is the intersection of the vertical strip meeting the horizontal axis in M with the horizontal strip meeting the vertical axis in N. The probability $P(X \in M)$ is the portion of the joint probability mass in the vertical strip; the probability $P(Y \in N)$ is the part of the joint probability mass in the horizontal strip. The probability in the rectangle is the product of these marginal probabilities.

This suggests a useful test for nonindependence which we call the rectangle test. We illustrate with a 
simple example. 
[Figure: the square of Example 9.4 with a small rectangle M x N outside it; labels P(X in M) > 0, P(Y in N) > 0, P(X in M, Y in N) = 0.]

Figure 9.2: Rectangle test for nonindependence of a pair of random variables.



Example 9.4: The rectangle test for nonindependence 

Suppose probability mass is uniformly distributed over the square with vertices at (1,0), (2,1), (1,2), (0,1). It is evident from Figure 9.2 that a value of X determines the possible values of Y and vice versa, so that we would not expect independence of the pair. To establish this, consider the small rectangle M x N shown on the figure. There is no probability mass in the region. Yet $P(X \in M) > 0$ and $P(Y \in N) > 0$, so that

$P(X \in M)\,P(Y \in N) > 0$, but $P((X, Y) \in M \times N) = 0$. The product rule fails; hence the pair cannot be stochastically independent.
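The rectangle test of this example can be checked numerically. The sketch below is written in Python and makes several assumptions for illustration only: the uniform distribution is replaced by equal masses at grid midpoints, and the intervals M = N = [0, 0.25] are hypothetical choices of a rectangle lying outside the square.

```python
def in_square(t, u):
    # Square with vertices (1,0), (2,1), (1,2), (0,1)
    return abs(t - 1) + abs(u - 1) <= 1


n = 200                       # grid points per axis on [0, 2] x [0, 2]
h = 2 / n
mid = [(k + 0.5) * h for k in range(n)]
support = [(t, u) for t in mid for u in mid if in_square(t, u)]
mass = 1 / len(support)       # uniform distribution: equal mass per support point


def prob(event):
    return sum(mass for (t, u) in support if event(t, u))


def in_M(t):
    return t <= 0.25          # hypothetical interval M = [0, 0.25]


def in_N(u):
    return u <= 0.25          # hypothetical interval N = [0, 0.25]


p_M = prob(lambda t, u: in_M(t))
p_N = prob(lambda t, u: in_N(u))
p_MN = prob(lambda t, u: in_M(t) and in_N(u))
print(p_M, p_N, p_MN)         # both marginals positive, rectangle mass zero
```

Since both strips carry positive mass while the rectangle carries none, the product rule fails for this choice of M and N.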

Remark. There are nonindependent cases for which this test does not work. And it does not provide a test 
for independence. In spite of these limitations, it is frequently useful. Because of the information contained 
in the independence condition, in many cases the complete joint and marginal distributions may be obtained 
with appropriate partial information. The following is a simple example. 

Example 9.5: Joint and marginal probabilities from partial information 

Suppose the pair {X, Y} is independent and each has three possible values. The following four items of information are available.

$$P(X = t_1) = 0.2, \quad P(Y = u_1) = 0.3, \quad P(X = t_1, Y = u_2) = 0.08 \tag{9.10}$$

$$P(X = t_2, Y = u_1) = 0.15 \tag{9.11}$$

These values are shown in bold type on Figure 9.3. A combination of the product rule and the fact that the total probability mass is one is used to calculate each of the marginal and joint probabilities. For example, $P(X = t_1) = 0.2$ and $P(X = t_1, Y = u_2) = P(X = t_1)\,P(Y = u_2) = 0.08$ implies $P(Y = u_2) = 0.4$. Then $P(Y = u_3) = 1 - P(Y = u_1) - P(Y = u_2) = 0.3$. Others are calculated similarly. There is no unique procedure for solution. And it has not seemed useful to develop MATLAB procedures to accomplish this.



               t_1      t_2      t_3      P(Y = u_j)
u_3            0.06     0.15     0.09     0.3
u_2            0.08*    0.20     0.12     0.4
u_1            0.06     0.15*    0.09     0.3*
P(X = t_i)     0.2*     0.5      0.3

Originals (bold in the figure) are marked *; the remaining values are calculated.

Figure 9.3: Joint and marginal probabilities from partial information.
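The calculation in Example 9.5 is short enough to spell out step by step. The following Python sketch follows one possible order of solution (the text notes there is no unique procedure); the variable names are chosen here for illustration.

```python
# Given information (Example 9.5)
p_x1 = 0.2        # P(X = t1)
p_y1 = 0.3        # P(Y = u1)
p_x1_y2 = 0.08    # P(X = t1, Y = u2)
p_x2_y1 = 0.15    # P(X = t2, Y = u1)

# Product rule, then total mass one, for each marginal
p_y2 = p_x1_y2 / p_x1          # P(Y = u2)
p_y3 = 1 - p_y1 - p_y2         # P(Y = u3)
p_x2 = p_x2_y1 / p_y1          # P(X = t2)
p_x3 = 1 - p_x1 - p_x2         # P(X = t3)

px = [p_x1, p_x2, p_x3]
py = [p_y1, p_y2, p_y3]

# joint[j][i] = P(X = t_(i+1), Y = u_(j+1)) by the product rule
joint = [[pxi * pyj for pxi in px] for pyj in py]
for row in joint:
    print([round(p, 2) for p in row])
```

Each row of `joint` corresponds to one value of Y, so the rows can be compared with the rows of Figure 9.3.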



Example 9.6: The joint normal distribution 

A pair {X, Y} has the joint normal distribution iff the joint density is

$$f_{XY}(t,u) = \frac{1}{2\pi\sigma_X\sigma_Y(1-\rho^2)^{1/2}}\,e^{-Q(t,u)/2} \tag{9.12}$$

where

$$Q(t,u) = \frac{1}{1-\rho^2}\left[\left(\frac{t-\mu_X}{\sigma_X}\right)^2 - 2\rho\left(\frac{t-\mu_X}{\sigma_X}\right)\left(\frac{u-\mu_Y}{\sigma_Y}\right) + \left(\frac{u-\mu_Y}{\sigma_Y}\right)^2\right] \tag{9.13}$$

The marginal densities are obtained with the aid of some algebraic tricks to integrate the joint density. The result is that $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$. If the parameter $\rho$ is set to zero, the result is

$$f_{XY}(t,u) = f_X(t)\,f_Y(u) \tag{9.14}$$

so that the pair is independent iff $\rho = 0$. The details are left as an exercise for the interested reader.

Remark. While it is true that every independent pair of normally distributed random variables is joint 
normal, not every pair of normally distributed random variables has the joint normal distribution. 
Example 9.7: A normal pair not joint normally distributed

We start with the distribution for a joint normal pair and derive a joint distribution for a normal pair which is not joint normal. The function

$$\varphi(t,u) = \frac{1}{2\pi}\exp\left(-\frac{t^2}{2} - \frac{u^2}{2}\right) \tag{9.15}$$

is the joint normal density for an independent pair ($\rho = 0$) of standardized normal random variables. Now define the joint density for a pair {X, Y} by

$$f_{XY}(t,u) = 2\varphi(t,u) \text{ in the first and third quadrants, and zero elsewhere} \tag{9.16}$$

Both X ~ N(0, 1) and Y ~ N(0, 1). However, they cannot be joint normal, since the joint normal distribution is positive for all (t, u).



9.1.4 Independent classes 

Since independence of random variables is independence of the events determined by the random variables, 
extension to general classes is simple and immediate. 
Definition 

A class $\{X_i : i \in J\}$ of random variables is (stochastically) independent iff the product rule holds for every finite subclass of two or more.

Remark. The index set J in the definition may be finite or infinite.

For a finite class $\{X_i : 1 \le i \le n\}$, independence is equivalent to the product rule

$$F_{X_1 X_2 \cdots X_n}(t_1, t_2, \cdots, t_n) = \prod_{i=1}^{n} F_{X_i}(t_i) \quad \text{for all } (t_1, t_2, \cdots, t_n) \tag{9.17}$$

Since we may obtain the joint distribution function for any finite subclass by letting the arguments for the others be $\infty$ (i.e., by taking the limits as the appropriate $t_i$ increase without bound), the single product rule suffices to account for all finite subclasses.

Absolutely continuous random variables 

If a class $\{X_i : i \in J\}$ is independent and the individual variables are absolutely continuous (i.e., have densities), then any finite subclass is jointly absolutely continuous and the product rule holds for the densities of such subclasses

$$f_{X_{i1} X_{i2} \cdots X_{im}}(t_{i1}, t_{i2}, \cdots, t_{im}) = \prod_{k=1}^{m} f_{X_{ik}}(t_{ik}) \quad \text{for all } (t_1, t_2, \cdots, t_n) \tag{9.18}$$

Similarly, if each finite subclass is jointly absolutely continuous, then each individual variable is absolutely 
continuous and the product rule holds for the densities. Frequently we deal with independent classes in 
which each random variable has the same marginal distribution. Such classes are referred to as iid classes 
(an acronym for independent, identically distributed). Examples are simple random samples from a given 
population, or the results of repetitive trials with the same distribution on the outcome of each component 
trial. A Bernoulli sequence is a simple example. 

9.1.5 Simple random variables 

Consider a pair {X, Y} of simple random variables in canonical form

$$X = \sum_{i=1}^{n} t_i I_{A_i} \qquad Y = \sum_{j=1}^{m} u_j I_{B_j} \tag{9.19}$$

Since $A_i = \{X = t_i\}$ and $B_j = \{Y = u_j\}$, the pair {X, Y} is independent iff each of the pairs $\{A_i, B_j\}$ is independent. The joint distribution has probability mass at each point $(t_i, u_j)$ in the range of W = (X, Y). Thus at every point on the grid,

$$P(X = t_i, Y = u_j) = P(X = t_i)\,P(Y = u_j) \tag{9.20}$$

According to the rectangle test, no gridpoint having one of the $t_i$ or $u_j$ as a coordinate has zero probability mass. The marginal distributions determine the joint distributions. If X has n distinct values and Y has m distinct values, then the n + m marginal probabilities suffice to determine the m · n joint probabilities. Since the marginal probabilities for each variable must add to one, only (n - 1) + (m - 1) = m + n - 2 values are needed.

Suppose X and Y are in affine form. That is,

$$X = a_0 + \sum_{i=1}^{n} a_i I_{E_i} \qquad Y = b_0 + \sum_{j=1}^{m} b_j I_{F_j} \tag{9.21}$$

Since $A_r = \{X = t_r\}$ is the union of minterms generated by the $E_i$ and $B_s = \{Y = u_s\}$ is the union of minterms generated by the $F_j$, the pair {X, Y} is independent iff each pair of minterms $\{M_a, N_b\}$ generated by the two classes, respectively, is independent. Independence of the minterm pairs is implied by independence of the combined class

$$\{E_i, F_j : 1 \le i \le n,\ 1 \le j \le m\} \tag{9.22}$$

Calculations in the joint simple case are readily handled by appropriate m-functions and m-procedures. 
MATLAB and independent simple random variables 

In the general case of pairs of joint simple random variables we have the m-procedure jcalc, which uses 
information in matrices X, Y, and P to determine the marginal probabilities and the calculation matrices 
t and u. In the independent case, we need only the marginal distributions in matrices X, PX, Y, and PY 
to determine the joint probability matrix (hence the joint distribution) and the calculation matrices t and 
u. If the random variables are given in canonical form, we have the marginal distributions. If they are in 
affine form, we may use canonic (or the function form canonicf) to obtain the marginal distributions. 

Once we have both marginal distributions, we use an m-procedure we call icalc. Formation of the joint 
probability matrix is simply a matter of determining all the joint probabilities 

$$P(i,j) = P(X = t_i, Y = u_j) = P(X = t_i)\,P(Y = u_j) \tag{9.23}$$

Once these are calculated, formation of the calculation matrices t and u is achieved exactly as in jcalc. 
Example 9.8: Use of icalc to set up for joint calculations 

X = [-4 -2 0 1 3];
Y = [0 1 2 4];
PX = 0.01*[12 18 27 19 24];
PY = 0.01*[15 43 31 11];
icalc
Enter row matrix of X-values  X
Enter row matrix of Y-values  Y
Enter X probabilities  PX
Enter Y probabilities  PY
 Use array operations on matrices X, Y, PX, PY, t, u, and P
disp(P)                        % Optional display of the joint matrix
    0.0132    0.0198    0.0297    0.0209    0.0264
    0.0372    0.0558    0.0837    0.0589    0.0744
    0.0516    0.0774    0.1161    0.0817    0.1032
    0.0180    0.0270    0.0405    0.0285    0.0360
disp(t)                        % Calculation matrix t
    -4    -2     0     1     3
    -4    -2     0     1     3
    -4    -2     0     1     3
    -4    -2     0     1     3
disp(u)                        % Calculation matrix u
     4     4     4     4     4
     2     2     2     2     2
     1     1     1     1     1
     0     0     0     0     0
M = (t>=-3)&(t<=2);            % M = [-3, 2]
PM = total(M.*P)               % P(X in M)
PM =   0.6400
N = (u>0)&(u.^2<=15);          % N = {u: u > 0, u^2 <= 15}
PN = total(N.*P)               % P(Y in N)
PN =   0.7400
Q = M&N;                       % Rectangle MxN
PQ = total(Q.*P)               % P((X,Y) in MxN)
PQ =   0.4736
p = PM*PN
p  =   0.4736                  % P((X,Y) in MxN) = P(X in M)P(Y in N)
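For readers without the m-files, the joint calculation of Example 9.8 can be sketched in a few lines of Python. This is an illustration of the product rule (9.23), not a port of icalc; the dictionary `joint` and the helper `prob` are names introduced here.

```python
X = [-4, -2, 0, 1, 3]
Y = [0, 1, 2, 4]
PX = [0.12, 0.18, 0.27, 0.19, 0.24]
PY = [0.15, 0.43, 0.31, 0.11]

# Product rule: P(X = x, Y = y) = P(X = x) P(Y = y) for an independent pair
joint = {(x, y): px * py for x, px in zip(X, PX) for y, py in zip(Y, PY)}


def prob(event):
    """Total mass of the grid points (x, y) for which event(x, y) is true."""
    return sum(p for (x, y), p in joint.items() if event(x, y))


def in_M(x):
    return -3 <= x <= 2            # M = [-3, 2]


def in_N(y):
    return y > 0 and y ** 2 <= 15  # N = {u: u > 0, u^2 <= 15}


PM = prob(lambda x, y: in_M(x))
PN = prob(lambda x, y: in_N(y))
PQ = prob(lambda x, y: in_M(x) and in_N(y))
print(round(PM, 4), round(PN, 4), round(PQ, 4), round(PM * PN, 4))
```

The last two printed values agree, as the product rule for the rectangle M x N requires.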

As an example, consider again the problem of joint Bernoulli trials described in the treatment of Composite 
trials (Section 4.3). 

Example 9.9: The joint Bernoulli trial of Example 4.9. 

Bill and Mary take ten basketball free throws each. We assume the two sequences of trials are independent of each other, and each is a Bernoulli sequence.
Mary: Has probability 0.80 of success on each trial.
Bill: Has probability 0.85 of success on each trial.
What is the probability Mary makes more free throws than Bill?
SOLUTION

Let X be the number of goals that Mary makes and Y be the number that Bill makes. Then X ~ binomial (10, 0.8) and Y ~ binomial (10, 0.85).

X = 0:10;
Y = 0:10;
PX = ibinom(10,0.8,X);
PY = ibinom(10,0.85,Y);
icalc
Enter row matrix of X-values  X     % Could enter 0:10
Enter row matrix of Y-values  Y     % Could enter 0:10
Enter X probabilities  PX           % Could enter ibinom(10,0.8,X)
Enter Y probabilities  PY           % Could enter ibinom(10,0.85,Y)
 Use array operations on matrices X, Y, PX, PY, t, u, and P
PM = total((t>u).*P)
PM =  0.2738          % Agrees with solution in Example 9 from "Composite Trials".
Pe = total((u==t).*P) % Additional information is more easily
Pe =  0.2276          % obtained than in the event formulation
Pm = total((t>=u).*P) % of Example 9 from "Composite Trials".
Pm =  0.5014
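The same comparison can be sketched in Python using only the standard library. The helper `binomial_pmf` stands in for the text's ibinom and is an assumption of this sketch, as is the helper `prob`.

```python
from math import comb


def binomial_pmf(n, p):
    """Probabilities P(S = k), k = 0..n, for S ~ binomial(n, p)."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


px = binomial_pmf(10, 0.80)   # Mary
py = binomial_pmf(10, 0.85)   # Bill


def prob(event):
    # Independence: the joint mass at (t, u) is px[t] * py[u]
    return sum(px[t] * py[u] for t in range(11) for u in range(11) if event(t, u))


p_more = prob(lambda t, u: t > u)        # Mary makes more than Bill
p_equal = prob(lambda t, u: t == u)      # both make the same number
p_at_least = prob(lambda t, u: t >= u)   # Mary makes at least as many
print(round(p_more, 4), round(p_equal, 4), round(p_at_least, 4))
```

If the sketch is set up correctly, the three printed values should agree with PM, Pe, and Pm in the transcript above.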
Example 9.10: Sprinters time trials

Twelve world class sprinters in a meet are running in two heats of six persons each. Each runner has a reasonable chance of breaking the track record. We suppose results for individuals are independent.
First heat probabilities: 0.61 0.73 0.55 0.81 0.66 0.43
Second heat probabilities: 0.75 0.48 0.62 0.58 0.77 0.51
Compare the two heats for numbers who break the track record.
SOLUTION

Let X be the number of successes in the first heat and Y be the number who are successful in the second heat. Then the pair {X, Y} is independent. We use the m-function canonicf to determine the distributions for X and for Y, then icalc to get the joint distribution.

c1 = [ones(1,6) 0];
c2 = [ones(1,6) 0];
P1 = [0.61 0.73 0.55 0.81 0.66 0.43];
P2 = [0.75 0.48 0.62 0.58 0.77 0.51];
[X,PX] = canonicf(c1,minprob(P1));
[Y,PY] = canonicf(c2,minprob(P2));
icalc
Enter row matrix of X-values  X
Enter row matrix of Y-values  Y
Enter X probabilities  PX
Enter Y probabilities  PY
 Use array operations on matrices X, Y, PX, PY, t, u, and P
Pm1 = total((t>u).*P)      % Prob first heat has most
Pm1 =  0.3986
Pm2 = total((u>t).*P)      % Prob second heat has most
Pm2 =  0.3606
Peq = total((t==u).*P)     % Prob both have the same
Peq =  0.2408
Px3 = (X>=3)*PX'           % Prob first has 3 or more
Px3 =  0.8708
Py3 = (Y>=3)*PY'           % Prob second has 3 or more
Py3 =  0.8525
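The distribution of the number of successes in each heat can also be built up one runner at a time. The Python sketch below uses this convolution approach in place of minprob and canonicf; it is a different route to the same distributions, and the function name `count_distribution` is introduced here for illustration.

```python
def count_distribution(probs):
    """Distribution of the number of successes in independent trials
    with success probabilities probs (entry k is P(count = k))."""
    dist = [1.0]                          # no trials yet: count is 0
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1 - p)         # this runner fails
            new[k + 1] += q * p           # this runner succeeds
        dist = new
    return dist


P1 = [0.61, 0.73, 0.55, 0.81, 0.66, 0.43]   # first heat
P2 = [0.75, 0.48, 0.62, 0.58, 0.77, 0.51]   # second heat
px = count_distribution(P1)
py = count_distribution(P2)


def prob(event):
    # Independence of the heats: joint mass at (t, u) is px[t] * py[u]
    return sum(px[t] * py[u] for t in range(7) for u in range(7) if event(t, u))


pm1 = prob(lambda t, u: t > u)    # first heat has most
pm2 = prob(lambda t, u: u > t)    # second heat has most
peq = prob(lambda t, u: t == u)   # both have the same
px3 = sum(px[3:])                 # first heat has 3 or more
py3 = sum(py[3:])                 # second heat has 3 or more
print(round(pm1, 4), round(pm2, 4), round(peq, 4), round(px3, 4), round(py3, 4))
```

The printed values can be compared with Pm1, Pm2, Peq, Px3, and Py3 in the transcript above.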

As in the case of jcalc, we have an m-function version icalcf 

[x,y,t,u,px,py,p] = icalcf(X,Y,PX,PY) (9.24)

We have a related m-function idbn for obtaining the joint probability matrix from the marginal probabilities. 
Its formation of the joint matrix utilizes the same operations as icalc. 

Example 9.11: A numerical example 



PX = 0.1*[3 5 2];
PY = 0.01*[20 15 40 25];
P  = idbn(PX,PY)
P =
    0.0750    0.1250    0.0500
    0.1200    0.2000    0.0800
    0.0450    0.0750    0.0300
    0.0600    0.1000    0.0400
An m-procedure itest checks a joint distribution for independence. It does this by calculating the marginals, then forming an independent joint test matrix, which is compared with the original. We do not ordinarily exhibit the matrix P to be tested. However, this is a case in which the product rule holds for most of the minterms, and it would be very difficult to pick out those for which it fails. The m-procedure simply checks all of them.

Example 9.12

idemo1                           % Joint matrix in datafile idemo1
P =  0.0091    0.0147    0.0035    0.0049    0.0105    0.0161    0.0112
     0.0117    0.0189    0.0045    0.0063    0.0135    0.0207    0.0144
     0.0104    0.0168    0.0040    0.0056    0.0120    0.0184    0.0128
     0.0169    0.0273    0.0065    0.0091    0.0095    0.0299    0.0208
     0.0052    0.0084    0.0020    0.0028    0.0060    0.0092    0.0064
     0.0169    0.0273    0.0065    0.0091    0.0195    0.0299    0.0208
     0.0104    0.0168    0.0040    0.0056    0.0120    0.0184    0.0128
     0.0078    0.0126    0.0030    0.0042    0.0190    0.0138    0.0096
     0.0117    0.0189    0.0045    0.0063    0.0135    0.0207    0.0144
     0.0091    0.0147    0.0035    0.0049    0.0105    0.0161    0.0112
     0.0065    0.0105    0.0025    0.0035    0.0075    0.0115    0.0080
     0.0143    0.0231    0.0055    0.0077    0.0165    0.0253    0.0176
itest
Enter matrix of joint probabilities  P
The pair {X,Y} is NOT independent   % Result of test
To see where the product rule fails, call for D
disp(D)                          % Optional call for D
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     1     1     1     1     1     1     1
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     1     1     1     1     1     1     1
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
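The idea behind such a test (compute the marginals, form their product, compare entry by entry) can be sketched in Python as follows. This is an illustration, not the text's itest: the function name `independence_test`, the tolerance `tol`, and the small perturbed matrix `Q` are assumptions made for the example.

```python
def independence_test(P, tol=1e-7):
    """Compare joint matrix P with the product of its own marginals.

    Returns (independent, D), where D[i][j] = 1 marks an entry at which
    the product rule fails by more than tol (tol is an assumed value).
    """
    py = [sum(row) for row in P]          # row sums: marginal for Y
    px = [sum(col) for col in zip(*P)]    # column sums: marginal for X
    D = [[int(abs(P[i][j] - px[j] * py[i]) > tol) for j in range(len(px))]
         for i in range(len(py))]
    independent = not any(any(row) for row in D)
    return independent, D


PX = [0.3, 0.5, 0.2]
PY = [0.25, 0.40, 0.15, 0.20]
P = [[x * y for x in PX] for y in PY]     # independent by construction
ok, D = independence_test(P)

Q = [row[:] for row in P]                 # move mass 0.01 within one column
Q[0][1] -= 0.01
Q[2][1] += 0.01
ok_q, D_q = independence_test(Q)

print(ok, ok_q)
for row in D_q:
    print(row)
```

Moving mass between two entries of one column changes two row marginals but no column marginal, so the product rule fails along those two rows only. The matrix D of Example 9.12 shows the same pattern in its fourth and eighth rows.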

Next, we consider an example in which the pair is known to be independent. 
Example 9.13 

jdemo3                           % call for data in m-file
disp(P)                          % call to display P
    0.0132    0.0198    0.0297    0.0209    0.0264
    0.0372    0.0558    0.0837    0.0589    0.0744
    0.0516    0.0774    0.1161    0.0817    0.1032
    0.0180    0.0270    0.0405    0.0285    0.0360
itest
Enter matrix of joint probabilities  P
The pair {X,Y} is independent       % Result of test

The procedure icalc can be extended to deal with an independent class of three random variables. We call 
the m-procedure icalc3. The following is a simple example of its use. 

Example 9.14: Calculations for three independent random variables 

X = 0:4;
Y = 1:2:7;
Z = 0:3:12;
PX = 0.1*[1 3 2 3 1];
PY = 0.1*[2 2 3 3];
PZ = 0.1*[2 2 1 3 2];
icalc3
Enter row matrix of X-values  X
Enter row matrix of Y-values  Y
Enter row matrix of Z-values  Z
Enter X probabilities  PX
Enter Y probabilities  PY
Enter Z probabilities  PZ
Use array operations on matrices X, Y, Z,
PX, PY, PZ, t, u, v, and P
G = 3*t + 2*u - 4*v;             % W = 3X + 2Y - 4Z
[W,PW] = csort(G,P);             % Distribution for W
PG = total((G>0).*P)             % P(g(X,Y,Z) > 0)
PG =  0.3370
Pg = (W>0)*PW'                   % P(W > 0)
Pg =  0.3370
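The calculation for three independent variables can be sketched in Python by enumerating all value triples. This is an illustration of the product rule for a class of three, not a port of icalc3 or csort; the dictionary `dist` plays the role of the pair (W, PW).

```python
from collections import defaultdict
from itertools import product

X = [0, 1, 2, 3, 4]
Y = [1, 3, 5, 7]
Z = [0, 3, 6, 9, 12]
PX = [0.1, 0.3, 0.2, 0.3, 0.1]
PY = [0.2, 0.2, 0.3, 0.3]
PZ = [0.2, 0.2, 0.1, 0.3, 0.2]

# Distribution of W = 3X + 2Y - 4Z: add up the product-rule masses
# of all triples (x, y, z) that give the same value of W.
dist = defaultdict(float)
for (x, px), (y, py), (z, pz) in product(zip(X, PX), zip(Y, PY), zip(Z, PZ)):
    dist[3 * x + 2 * y - 4 * z] += px * py * pz

pg = sum(p for w, p in dist.items() if w > 0)   # P(W > 0)
print(round(pg, 4))
```

The probability P(W > 0) obtained this way can be compared with PG and Pg in the transcript above.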

An m-procedure icalc4 to handle an independent class of four variables is also available. Also several 
variations of the m-function mgsum and the m-function diidsum are used for obtaining distributions for 
sums of independent random variables. We consider them in various contexts in other units. 

9.1.6 Approximation for the absolutely continuous case 

In the study of functions of random variables, we show that an approximating simple random variable $X_s$ of the type we use is a function of the random variable X which is approximated. Also, we show that if {X, Y} is an independent pair, so is {g(X), h(Y)} for any reasonable functions g and h. Thus if {X, Y} is an independent pair, so is any pair of approximating simple functions $\{X_s, Y_s\}$ of the type considered. Now it is theoretically possible for the approximating pair $\{X_s, Y_s\}$ to be independent, yet have the approximated pair {X, Y} not independent. But this is highly unlikely. For all practical purposes, we may consider {X, Y} to be independent iff $\{X_s, Y_s\}$ is independent. When in doubt, consider a second pair of approximating simple functions with more subdivision points. This decreases even further the likelihood of a false indication of independence by the approximating random variables.

Example 9.15: An independent pair 

Suppose X ~ exponential (3) and Y ~ exponential (2) with

$$f_{XY}(t,u) = 6e^{-3t}e^{-2u} = 6e^{-(3t+2u)}, \quad t \ge 0,\ u \ge 0 \tag{9.25}$$

Since $e^{-12} \approx 6 \times 10^{-6}$, we approximate X for values up to 4 and Y for values up to 6.

tuappr
Enter matrix [a b] of X-range endpoints  [0 4]
Enter matrix [c d] of Y-range endpoints  [0 6]
Enter number of X approximation points  200
Enter number of Y approximation points  300
Enter expression for joint density  6*exp(-(3*t + 2*u))
Use array operations on X, Y, PX, PY, t, u, and P
itest
Enter matrix of joint probabilities  P
The pair {X,Y} is independent

Example 9.16: Test for independence 

The pair {X, Y} has joint density $f_{XY}(t,u) = 4tu$, $0 \le t \le 1$, $0 \le u \le 1$. It is easy enough to determine the marginals in this case. By symmetry, they are the same.

$$f_X(t) = 4t\int_0^1 u\,du = 2t, \quad 0 \le t \le 1 \tag{9.26}$$

so that $f_{XY} = f_X f_Y$, which ensures the pair is independent. Consider the solution using tuappr and itest.

tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  100
Enter number of Y approximation points  100
Enter expression for joint density  4*t.*u
Use array operations on X, Y, PX, PY, t, u, and P
itest
Enter matrix of joint probabilities  P
The pair {X,Y} is independent
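The discrete check of Example 9.16 can be imitated in Python. The sketch below assumes a midpoint grid with normalized cell masses, which may differ in detail from what tuappr does; it then measures the largest deviation from the product rule.

```python
n = 100                                   # approximation points per axis
h = 1 / n
mid = [(k + 0.5) * h for k in range(n)]   # cell midpoints on [0, 1]

# Cell masses for the density 4tu; rows are indexed by u, columns by t
P = [[4 * t * u * h * h for t in mid] for u in mid]
total = sum(sum(row) for row in P)
P = [[p / total for p in row] for row in P]   # normalize to total mass one

py = [sum(row) for row in P]                  # marginal for Y (row sums)
px = [sum(col) for col in zip(*P)]            # marginal for X (column sums)

# Largest deviation from the product rule over the whole grid
max_dev = max(abs(P[i][j] - px[j] * py[i]) for i in range(n) for j in range(n))
print(max_dev)
```

Because the density factors as a function of t times a function of u, the cell masses factor as well, and the deviation is at the level of rounding error.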



9.2 Problems on Independent Classes of Random Variables 2 

Exercise 9.1 (Solution on p. 247.) 

The pair {X, Y} has the joint distribution (in m-file npr08_06.m (Section 17.8.37: npr08_06)):

X = [-2.3 -0.7 1.1 3.9 5.1]    Y = [1.3 2.5 4.1 5.3] (9.27)

P = [0.0483 0.0357 0.0420 0.0399 0.0441
     0.0437 0.0323 0.0380 0.0361 0.0399
     0.0713 0.0527 0.0620 0.0609 0.0551
     0.0667 0.0493 0.0580 0.0651 0.0589] (9.28)

Determine whether or not the pair {X, Y} is independent.



2 This content is available online at <http://cnx.org/content/m24298/1.4/>.
Exercise 9.2 (Solution on p. 247.)

The pair {X, Y} has the joint distribution (in m-file npr09_02.m (Section 17.8.41: npr09_02)):

X = [-3.9 1.7 1.5 2.8 4.1]   Y = [-2 1 2.6 5.1] (9.29)

P = [0.0589 0.0342 0.0304 0.0456 0.0209
     0.0961 0.0556 0.0498 0.0744 0.0341
     0.0682 0.0398 0.0350 0.0528 0.0242
     0.0868 0.0504 0.0448 0.0672 0.0308] (9.30)



Determine whether or not the pair {X, Y} is independent. 

Exercise 9.3 (Solution on p. 247.) 

The pair {X, Y} has the joint distribution (in m-file npr08_07.m (Section 17.8.38: npr08_07)): 



P(X = t, Y = u) (9.31)

           t =   -3.1     -0.5      1.2      2.4      3.7      4.9
u =  7.5       0.0090   0.0396   0.0594   0.0216   0.0440   0.0203
     4.1       0.0495   0        0.1089   0.0528   0.0363   0.0231
    -2.0       0.0405   0.1320   0.0891   0.0324   0.0297   0.0189
    -3.8       0.0510   0.0484   0.0726   0.0132   0        0.0077

Table 9.1

Determine whether or not the pair {X, Y} is independent. 
For the distributions in Exercises 4-10 below 

a. Determine whether or not the pair is independent. 

b. Use a discrete approximation and an independence test to verify results in part (a). 

Exercise 9.4 (Solution on p. 247.) 

$f_{XY}(t, u) = 1/\pi$ on the circle with radius one, center at (0,0).

Exercise 9.5 (Solution on p. 248.)

$f_{XY}(t, u) = 1/2$ on the square with vertices at (1,0), (2,1), (1,2), (0,1) (see Exercise 11
(Exercise 8.11) from "Problems on Random Vectors and Joint Distributions").

Exercise 9.6 (Solution on p. 248.)

$f_{XY}(t, u) = 4t(1 - u)$ for $0 \le t \le 1$, $0 \le u \le 1$ (see Exercise 12 (Exercise 8.12) from "Problems
on Random Vectors and Joint Distributions").

Exercise 9.7 (Solution on p. 248.)

$f_{XY}(t, u) = \frac{1}{8}(t + u)$ for $0 \le t \le 2$, $0 \le u \le 2$ (see Exercise 13 (Exercise 8.13) from "Problems on
Random Vectors and Joint Distributions").

Exercise 9.8 (Solution on p. 249.)

$f_{XY}(t, u) = 4ue^{-2t}$ for $0 \le t$, $0 \le u \le 1$ (see Exercise 14 (Exercise 8.14) from "Problems on
Random Vectors and Joint Distributions").

Exercise 9.9 (Solution on p. 249.)

$f_{XY}(t, u) = 12t^2 u$ on the parallelogram with vertices (-1,0), (0,0), (1,1), (0,1)
(see Exercise 16 (Exercise 8.16) from "Problems on Random Vectors and Joint Distributions").

Exercise 9.10 (Solution on p. 249.)

$f_{XY}(t, u) = \frac{24}{11} tu$ for $0 \le t \le 2$, $0 \le u \le \min\{1, 2 - t\}$ (see Exercise 17 (Exercise 8.17) from
"Problems on Random Vectors and Joint Distributions").

Exercise 9.11 (Solution on p. 250.) 

Two software companies, MicroWare and BusiCorp, are preparing a new business package in time 
for a computer trade show 180 days in the future. They work independently. MicroWare has 
anticipated completion time, in days, exponential (1/150). BusiCorp has time to completion, in 
days, exponential (1/130). What is the probability both will complete on time; that at least one 
will complete on time; that neither will complete on time? 

Exercise 9.12 (Solution on p. 250.) 

Eight similar units are put into operation at a given time. The time to failure (in hours) of each 
unit is exponential (1/750). If the units fail independently, what is the probability that five or more 
units will be operating at the end of 500 hours? 

Exercise 9.13 (Solution on p. 250.) 

The location of ten points along a line may be considered iid random variables with symmetric
triangular distribution on [1,3]. What is the probability that three or more will lie within distance 
1/2 of the point t = 2? 

Exercise 9.14 (Solution on p. 250.) 

A Christmas display has 200 lights. The times to failure are iid, exponential (1/10000). The 
display is on continuously for 750 hours (approximately one month). Determine the probability the 
number of lights which survive the entire period is at least 175, 180, 185, 190. 

Exercise 9.15 (Solution on p. 250.) 

A critical module in a network server has time to failure (in hours of machine time) exponential 
(1/3000). The machine operates continuously, except for brief times for maintenance or repair. The 
module is replaced routinely every 30 days (720 hours), unless failure occurs. If successive units 
fail independently, what is the probability of no breakdown due to the module for one year? 

Exercise 9.16 (Solution on p. 250.) 

Joan is trying to decide which of two sales opportunities to take. 

• In the first, she makes three independent calls. Payoffs are $570, $525, and $465, with 
respective probabilities of 0.57, 0.41, and 0.35. 

• In the second, she makes eight independent calls, with probability of success on each call 
p = 0.57. She realizes $150 profit on each successful sale. 

Let X be the net profit on the first alternative and Y be the net gain on the second. Assume the 
pair {X, Y} is independent. 

a. Which alternative offers the maximum possible gain? 

b. Compare probabilities in the two schemes that total sales are at least $600, $900, $1000, 
$1100. 

c. What is the probability the second exceeds the first, i.e., what is P(Y > X)?

Exercise 9.17 (Solution on p. 251.) 

Margaret considers five purchases in the amounts 5, 17, 21, 8, 15 dollars with respective probabilities 
0.37, 0.22, 0.38, 0.81, 0.63. Anne contemplates six purchases in the amounts 8, 15, 12, 18, 15, 12 
dollars, with respective probabilities 0.77, 0.52, 0.23, 0.41, 0.83, 0.58. Assume that all eleven 
possible purchases form an independent class. 

a. What is the probability Anne spends at least twice as much as Margaret? 

b. What is the probability Anne spends at least $30 more than Margaret? 
Exercise 9.18 (Solution on p. 252.)

James is trying to decide which of two sales opportunities to take. 

• In the first, he makes three independent calls. Payoffs are $310, $380, and $350, with respective 
probabilities of 0.35, 0.41, and 0.57. 

• In the second, he makes eight independent calls, with probability of success on each call 
p = 0.57. He realizes $100 profit on each successful sale. 

Let X be the net profit on the first alternative and Y be the net gain on the second. Assume the 
pair {X, Y} is independent. 

• Which alternative offers the maximum possible gain? 

• What is the probability the second exceeds the first, i.e., what is P(Y > X)?

• Compare probabilities in the two schemes that total sales are at least $600, $700, $750. 

Exercise 9.19 (Solution on p. 252.) 

A residential College plans to raise money by selling "chances" on a board. There are two games: 

Game 1: Pay $5 to play; win $20 with probability p1 = 0.05 (one in twenty)
Game 2: Pay $10 to play; win $30 with probability p2 = 0.2 (one in five)

Thirty chances are sold on Game 1 and fifty chances are sold on Game 2. If X and Y are the profits
on the respective games, then

X = 30 · 5 - 20 N1 and Y = 50 · 10 - 30 N2 (9.32)

where N1, N2 are the numbers of winners on the respective games. It is reasonable to suppose N1 ~
binomial (30, 0.05) and N2 ~ binomial (50, 0.2). It is reasonable to suppose the pair {N1, N2} is
independent, so that {X, Y} is independent. Determine the marginal distributions for X and Y,
then use icalc to obtain the joint distribution and the calculating matrices. The total profit for the 
College is Z = X + Y. What is the probability the College will lose money? What is the probability 
the profit will be $400 or more, less than $200, between $200 and $450? 

Exercise 9.20 (Solution on p. 253.) 

The class {X, Y, Z} of random variables is iid (independent, identically distributed) with common 
distribution 

X = [-5 -1 3 4 7]   PX = 0.01 * [15 20 30 25 10] (9.33)

Let $W = 3X - 4Y + 2Z$. Determine the distribution for W and from this determine $P(W > 0)$
and $P(-20 \le W \le 10)$. Do this with icalc, then repeat with icalc3 and compare results.

Exercise 9.21 (Solution on p. 254.) 

The class {A, B, C, D, E, F} is independent; the respective probabilities for these events are
{0.46, 0.27, 0.33, 0.47, 0.37, 0.41}. Consider the simple random variables

$X = 3I_A - 9I_B + 4I_C$, $Y = -2I_D + 6I_E + 2I_F - 3$, and $Z = 2X - 3Y$ (9.34)

Determine $P(Y > X)$, $P(Z > 0)$, $P(5 \le Z \le 25)$.

Exercise 9.22 (Solution on p. 254.) 

Two players, Ronald and Mike, throw a pair of dice 30 times each. What is the probability Mike 
throws more "sevens" than does Ronald? 

Exercise 9.23 (Solution on p. 254.) 

A class has fifteen boys and fifteen girls. They pair up and each tosses a coin 20 times. What is 
the probability that at least eight girls throw more heads than their partners? 
Exercise 9.24 (Solution on p. 254.)

Glenn makes five sales calls, with probabilities 0.37, 0.52, 0.48, 0.71, 0.63, of success on the 
respective calls. Margaret makes four sales calls with probabilities 0.77, 0.82, 0.75, 0.91, of success 
on the respective calls. Assume that all nine events form an independent class. If Glenn realizes 
a profit of $18.00 on each sale and Margaret earns $20.00 on each sale, what is the probability 
Margaret's gain is at least $10.00 more than Glenn's? 

Exercise 9.25 (Solution on p. 255.) 

Mike and Harry have a basketball shooting contest. 

• Mike shoots 10 ordinary free throws, worth two points each, with probability 0.75 of success 
on each shot. 

• Harry shoots 12 "three point" shots, with probability 0.40 of success on each shot. 

Let X, Y be the number of points scored by Mike and Harry, respectively. Determine $P(X \ge 15)$,
$P(Y \ge 15)$, and $P(X \ge Y)$.

Exercise 9.26 (Solution on p. 255.) 

Martha has the choice of two games. 

Game 1: Pay ten dollars for each "play." If she wins, she receives $20, for a net gain of $10 on the 

play; otherwise, she loses her $10. The probability of a win is 1/2, so the game is "fair." 
Game 2: Pay five dollars to play; receive $15 for a win. The probability of a win on any play is 

1/3. 

Martha has $100 to bet. She is trying to decide whether to play Game 1 ten times or Game 2 
twenty times. Let W1 and W2 be the respective net winnings (payoff minus fee to play).

• Determine $P(W2 \ge W1)$.

• Compare the two games further by calculating $P(W1 > 0)$ and $P(W2 > 0)$.

Which game seems preferable? 

Exercise 9.27 (Solution on p. 256.) 

Jim and Bill of the men's basketball team challenge women players Mary and Ellen to a free throw 
contest. Each takes five free throws. Make the usual independence assumptions. Jim, Bill, Mary, 
and Ellen have respective probabilities p = 0.82,0.87,0.80, and 0.85 of making each shot tried. 
What is the probability Mary and Ellen make a total number of free throws at least as great as the 
total made by the guys? 
Solutions to Exercises in Chapter 9

Solution to Exercise 9.1 (p. 242)

npr08_06 (Section 17.8.37: npr08_06)
Data are in X, Y, P
itest
Enter matrix of joint probabilities  P
The pair {X,Y} is NOT independent
To see where the product rule fails, call for D
disp(D)
     0     0     0     1     1
     0     0     0     1     1
     1     1     1     1     1
     1     1     1     1     1
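The product-rule test for a discrete joint distribution can be reproduced by hand or with a few lines of code. The Python sketch below is an assumption-laden illustration, not the m-program itest: the function name product_rule_failures and the tolerance 1e-7 are hypothetical choices. It forms the marginals from the joint probabilities of Exercise 9.1 and marks with a 1 each position where P(X = t, Y = u) differs from P(X = t) P(Y = u).

```python
# Joint probabilities from Exercise 9.1 (rows correspond to Y, columns to X).
P = [
    [0.0483, 0.0357, 0.0420, 0.0399, 0.0441],
    [0.0437, 0.0323, 0.0380, 0.0361, 0.0399],
    [0.0713, 0.0527, 0.0620, 0.0609, 0.0551],
    [0.0667, 0.0493, 0.0580, 0.0651, 0.0589],
]

def product_rule_failures(P, tol=1e-7):
    """Return a 0/1 matrix; 1 marks entries where the product rule fails."""
    py = [sum(row) for row in P]          # marginal probabilities for Y
    px = [sum(col) for col in zip(*P)]    # marginal probabilities for X
    return [[int(abs(P[j][i] - px[i] * py[j]) > tol)
             for i in range(len(px))] for j in range(len(py))]

D = product_rule_failures(P)
for row in D:
    print(*row)
independent = not any(any(row) for row in D)
print("independent" if independent else "NOT independent")
```

A single nonzero entry in D is enough to rule out independence; here the rule holds in the upper left block of six positions and fails everywhere else.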

Solution to Exercise 9.2 (p. 243) 



npr09_02 (Section 17.8.41: npr09_02)
Data are in X, Y, P
itest
Enter matrix of joint probabilities  P
The pair {X,Y} is NOT independent
To see where the product rule fails, call for D
disp(D)
     0     0     0     0     0
     0     1     1     0     0
     0     1     1     0     0
     0     0     0     0     0



Solution to Exercise 9.3 (p. 243) 



npr08_07 (Section 17.8.38: npr08_07)
Data are in X, Y, P
itest
Enter matrix of joint probabilities  P
The pair {X,Y} is NOT independent
To see where the product rule fails, call for D
disp(D)
     1     1     1     1     1     1
     1     1     1     1     1     1
     1     1     1     1     1     1
     1     1     1     1     1     1

Solution to Exercise 9.4 (p. 243) 

Not independent by the rectangle test. 

tuappr
Enter matrix [a b] of X-range endpoints  [-1 1]
Enter matrix [c d] of Y-range endpoints  [-1 1]
Enter number of X approximation points  100
Enter number of Y approximation points  100
Enter expression for joint density  (1/pi)*(t.^2 + u.^2<=1)
Use array operations on X, Y, PX, PY, t, u, and P
itest
Enter matrix of joint probabilities  P
The pair {X,Y} is NOT independent
To see where the product rule fails, call for D   % Not practical -- too large

Solution to Exercise 9.5 (p. 243) 

Not independent, by the rectangle test. 

tuappr 
Enter matrix [a b] of X-range endpoints [0 2] 
Enter matrix [c d] of Y-range endpoints [0 2] 
Enter number of X approximation points 200 
Enter number of Y approximation points 200 
Enter expression for joint density  (1/2)*(u<=min(1+t,3-t)).* ...
    (u>=max(1-t,t-1))
Use array operations on X, Y, PX, PY, t, u, and P 
itest 

Enter matrix of joint probabilities P 
The pair {X,Y} is NOT independent 
To see where the product rule fails, call for D 

Solution to Exercise 9.6 (p. 243) 

From the solution for Exercise 12 (Exercise 8.12) from "Problems on Random Vectors and Joint Distribu- 
tions" we have 

$f_X(t) = 2t, \ 0 \le t \le 1, \quad f_Y(u) = 2(1 - u), \ 0 \le u \le 1, \quad f_{XY} = f_X f_Y$ (9.35)

so the pair is independent.

tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  100
Enter number of Y approximation points  100
Enter expression for joint density  4*t.*(1-u)
Use array operations on X, Y, PX, PY, t, u, and P
itest
Enter matrix of joint probabilities  P
The pair {X,Y} is independent

Solution to Exercise 9.7 (p. 243) 

From the solution of Exercise 13 (Exercise 8.13) from "Problems on Random Vectors and Joint Distributions" 
we have 

$f_X(t) = f_Y(t) = \frac{1}{4}(t + 1), \quad 0 \le t \le 2$ (9.36)

so $f_{XY} \ne f_X f_Y$, which implies the pair is not independent.

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  100
Enter number of Y approximation points  100
Enter expression for joint density  (1/8)*(t+u)
Use array operations on X, Y, PX, PY, t, u, and P
itest
Enter matrix of joint probabilities  P
The pair {X,Y} is NOT independent
To see where the product rule fails, call for D

Solution to Exercise 9.8 (p. 243) 

From the solution for Exercise 14 (Exercise 8.14) from "Problems on Random Vectors and Joint Distributions"
we have

$f_X(t) = 2e^{-2t}, \ 0 \le t, \quad f_Y(u) = 2u, \ 0 \le u \le 1$ (9.37)

so that $f_{XY} = f_X f_Y$ and the pair is independent.

tuappr
Enter matrix [a b] of X-range endpoints  [0 5]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  500
Enter number of Y approximation points  100
Enter expression for joint density  4*u.*exp(-2*t)
Use array operations on X, Y, PX, PY, t, u, and P
itest
Enter matrix of joint probabilities  P
The pair {X,Y} is independent   % Product rule holds to within 10^{-9}

Solution to Exercise 9.9 (p. 243) 
Not independent by the rectangle test. 

tuappr 
Enter matrix [a b] of X-range endpoints [-1 1] 
Enter matrix [c d] of Y-range endpoints [0 1] 
Enter number of X approximation points 200 
Enter number of Y approximation points 100 
Enter expression for joint density  12*t.^2.*u.*(u<=min(t+1,1)).* ...
    (u>=max(0,t))
Use array operations on X, Y, PX, PY, t, u, and P
itest
Enter matrix of joint probabilities  P
The pair {X,Y} is NOT independent
To see where the product rule fails, call for D 

Solution to Exercise 9.10 (p. 244) 

By the rectangle test, the pair is not independent. 

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  200
Enter number of Y approximation points  100
Enter expression for joint density  (24/11)*t.*u.*(u<=min(1,2-t))
Use array operations on X, Y, PX, PY, t, u, and P
itest
Enter matrix of joint probabilities  P
The pair {X,Y} is NOT independent
To see where the product rule fails, call for D

Solution to Exercise 9.11 (p. 244) 

p1 = 1 - exp(-180/150)
p1 = 0.6988
p2 = 1 - exp(-180/130)
p2 = 0.7496
Pboth = p1*p2
Pboth = 0.5238
Poneormore = 1 - (1 - p1)*(1 - p2)   % 1 - Pneither
Poneormore = 0.9246
Pneither = (1 - p1)*(1 - p2)
Pneither = 0.0754
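The same arithmetic can be checked in any language with an exponential function. The Python lines below are a minimal sketch; the helper name exp_cdf is an assumption introduced here for illustration. It uses the fact that for a completion time T ~ exponential (λ), the probability of finishing by time t is P(T ≤ t) = 1 - e^{-λt}, and then applies the product rule for the independent pair.

```python
import math

def exp_cdf(rate, t):
    """P(T <= t) for T ~ exponential(rate), t >= 0."""
    return 1.0 - math.exp(-rate * t)

p1 = exp_cdf(1 / 150, 180)          # MicroWare completes within 180 days
p2 = exp_cdf(1 / 130, 180)          # BusiCorp completes within 180 days

p_both = p1 * p2                    # independence: multiply the probabilities
p_neither = (1 - p1) * (1 - p2)
p_one_or_more = 1 - p_neither       # complement of "neither"

print(round(p1, 4), round(p2, 4))
print(round(p_both, 4), round(p_one_or_more, 4), round(p_neither, 4))
```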

Solution to Exercise 9.12 (p. 244) 

p = exp(-500/750);     % Probability any one will survive
P = cbinom(8,p,5)      % Probability five or more will survive
P = 0.3930
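For readers without the m-functions, the binomial tail probability used here can be computed directly from the binomial formula. The Python sketch below is an illustration only; binom_pmf and binom_tail are hypothetical names and are not the m-functions ibinom and cbinom. It assumes cbinom(n,p,k) denotes P(S ≥ k) for S ~ binomial (n, p), which is how it is used in this solution.

```python
import math

def binom_pmf(n, p, k):
    """P(S = k) for S ~ binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_tail(n, p, k):
    """P(S >= k) for S ~ binomial(n, p)."""
    return sum(binom_pmf(n, p, j) for j in range(k, n + 1))

# Exercise 9.12: each of eight units survives 500 hours with probability p.
p_survive = math.exp(-500 / 750)
p_five_or_more = binom_tail(8, p_survive, 5)
print(round(p_five_or_more, 4))

# Exercise 9.13: ten points, each within 1/2 of t = 2 with probability 3/4.
p_three_or_more = binom_tail(10, 0.75, 3)
print(round(p_three_or_more, 4))
```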

Solution to Exercise 9.13 (p. 244) 

Geometrically, p = 3/4, so that P = cbinom(10,p,3) = 0.9996. 
Solution to Exercise 9.14 (p. 244) 



p = exp(-750/10000)
p = 0.9277
k = 175:5:190;
P = cbinom(200,p,k);
disp([k;P]')
  175.0000    0.9973
  180.0000    0.9449
  185.0000    0.6263
  190.0000    0.1381



Solution to Exercise 9.15 (p. 244) 

p = exp(-720/3000)
p = 0.7866        % Probability any unit survives
P = p^12          % Probability all twelve survive (assuming 12 periods)
P = 0.056

Solution to Exercise 9.16 (p. 244) 

$X = 570 I_A + 525 I_B + 465 I_C$ with [P(A) P(B) P(C)] = [0.57 0.41 0.35]. $Y = 150S$, where S ~ binomial
(8, 0.57).

c = [570 525 465 0];
pm = minprob([0.57 0.41 0.35]);
canonic                                   % Distribution for X
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
Y = 150*[0:8];                            % Distribution for Y
PY = ibinom(8,0.57,0:8);
icalc                                     % Joint distribution
Enter row matrix of X-values  X
Enter row matrix of Y-values  Y
Enter X probabilities  PX
Enter Y probabilities  PY
Use array operations on matrices X, Y, PX, PY, t, u, and P
xmax = max(X)
xmax = 1560
ymax = max(Y)
ymax = 1200
k = [600 900 1000 1100];
px = zeros(1,4);
for i = 1:4
  px(i) = (X>=k(i))*PX';
end
py = zeros(1,4);
for i = 1:4
  py(i) = (Y>=k(i))*PY';
end
disp([px;py]')
    0.4131    0.7765
    0.4131    0.2560
    0.3514    0.0784
    0.0818    0.0111
M = u > t;
PM = total(M.*P)
PM = 0.5081                               % P(Y>X)

Solution to Exercise 9.17 (p. 244) 

cx = [5 17 21 8 15 0];
pmx = minprob(0.01*[37 22 38 81 63]);
cy = [8 15 12 18 15 12 0];
pmy = minprob(0.01*[77 52 23 41 83 58]);
[X,PX] = canonicf(cx,pmx);
[Y,PY] = canonicf(cy,pmy);
icalc
Enter row matrix of X-values  X
Enter row matrix of Y-values  Y
Enter X probabilities  PX
Enter Y probabilities  PY
Use array operations on matrices X, Y, PX, PY, t, u, and P
M1 = u >= 2*t;
PM1 = total(M1.*P)
PM1 = 0.3448
M2 = u - t >= 30;
PM2 = total(M2.*P)
PM2 = 0.2431

Solution to Exercise 9.18 (p. 245) 

cx = [310 380 350 0];
pmx = minprob(0.01*[35 41 57]);
Y = 100*[0:8];
PY = ibinom(8,0.57,0:8);
canonic
 Enter row vector of coefficients  cx
 Enter row vector of minterm probabilities  pmx
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
icalc
Enter row matrix of X-values  X
Enter row matrix of Y-values  Y
Enter X probabilities  PX
Enter Y probabilities  PY
Use array operations on matrices X, Y, PX, PY, t, u, and P
xmax = max(X)
xmax = 1040
ymax = max(Y)
ymax = 800
PYgX = total((u>t).*P)
PYgX = 0.5081
k = [600 700 750];
px = zeros(1,3);
py = zeros(1,3);
for i = 1:3
  px(i) = (X>=k(i))*PX';
end
for i = 1:3
  py(i) = (Y>=k(i))*PY';
end
disp([px;py]')
    0.4131    0.2560
    0.2337    0.0784
    0.0818    0.0111

Solution to Exercise 9.19 (p. 245) 

N1 = 0:30;
PN1 = ibinom(30,0.05,0:30);
x = 150 - 20*N1;
[X,PX] = csort(x,PN1);
N2 = 0:50;
PN2 = ibinom(50,0.2,0:50);
y = 500 - 30*N2;
[Y,PY] = csort(y,PN2);
icalc
Enter row matrix of X-values  X
Enter row matrix of Y-values  Y
Enter X probabilities  PX
Enter Y probabilities  PY
Use array operations on matrices X, Y, PX, PY, t, u, and P
G = t + u;
Mlose = G < 0;
Mm400 = G >= 400;
M1200 = G < 200;
M200_450 = (G>=200)&(G<=450);
Plose = total(Mlose.*P)
Plose = 3.5249e-04
Pm400 = total(Mm400.*P)
Pm400 = 0.1957
P1200 = total(M1200.*P)
P1200 = 0.0828
P200_450 = total(M200_450.*P)
P200_450 = 0.8636

Solution to Exercise 9.20 (p. 245) 

Since icalc uses X and PX in its output, we avoid a renaming problem by using x and px for data vectors 
X and PX. 

x = [-5 -1 3 4 7];
px = 0.01*[15 20 30 25 10];
icalc
Enter row matrix of X-values  3*x
Enter row matrix of Y-values  -4*x
Enter X probabilities  px
Enter Y probabilities  px
Use array operations on matrices X, Y, PX, PY, t, u, and P
a = t + u;
[V,PV] = csort(a,P);
icalc
Enter row matrix of X-values  V
Enter row matrix of Y-values  2*x
Enter X probabilities  PV
Enter Y probabilities  px
Use array operations on matrices X, Y, PX, PY, t, u, and P
b = t + u;
[W,PW] = csort(b,P);
P1 = (W>0)*PW'
P1 = 0.5300
P2 = ((-20<=W)&(W<=10))*PW'
P2 = 0.5514
icalc3                         % Alternate using icalc3
Enter row matrix of X-values  x
Enter row matrix of Y-values  x
Enter row matrix of Z-values  x
Enter X probabilities  px
Enter Y probabilities  px
Enter Z probabilities  px
Use array operations on matrices X, Y, Z,
PX, PY, PZ, t, u, v, and P
a = 3*t - 4*u + 2*v;
[W,PW] = csort(a,P);
P1 = (W>0)*PW'
P1 = 0.5300
P2 = ((-20<=W)&(W<=10))*PW'
P2 = 0.5514

Solution to Exercise 9.21 (p. 245) 

cx = [3 -9 4 0];
pmx = minprob(0.01*[42 27 33]);
cy = [-2 6 2 -3];
pmy = minprob(0.01*[47 37 41]);
[X,PX] = canonicf(cx,pmx);
[Y,PY] = canonicf(cy,pmy);
icalc 

Enter row matrix of X-values X 
Enter row matrix of Y-values Y 
Enter X probabilities PX 
Enter Y probabilities PY 

Use array operations on matrices X, Y, PX, PY, t, u, and P 
G = 2*t - 3*u; 
[Z,PZ] = csort(G,P) ; 
PYgX = total ( (u>t) .*P) 
PYgX = 0.3752 
PZpos = (Z>0)*PZ' 
PZpos = 0.5654 

P5Z25 = ((5<=Z)&(Z<=25))*PZ' 
P5Z25 = 0.4745 

Solution to Exercise 9.22 (p. 245) 

P = (ibinom(30,1/6,0:29))*(cbinom(30,1/6,1:30))' = 0.4307
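The matrix product in this solution is the sum over k of P(Ronald throws k sevens) times P(Mike throws at least k + 1 sevens). A plain Python version of that sum is sketched below as a cross-check; the function names are assumptions made for this illustration, and the code relies only on the binomial formula with success probability 1/6 for a seven on each throw of the pair of dice.

```python
import math

def binom_pmf(n, p, k):
    """P(S = k) for S ~ binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_second_exceeds(n, p):
    """P(M > R) for independent R, M, each binomial(n, p)."""
    pmf = [binom_pmf(n, p, k) for k in range(n + 1)]
    # For each count r of the first player, the second needs r + 1 or more.
    return sum(pmf[r] * sum(pmf[r + 1:]) for r in range(n))

p_mike_more = prob_second_exceeds(30, 1 / 6)
p_equal = sum(binom_pmf(30, 1 / 6, k) ** 2 for k in range(31))
print(round(p_mike_more, 4))

# By symmetry, P(M > R) = P(R > M), so 2 P(M > R) + P(M = R) = 1.
print(round(2 * p_mike_more + p_equal, 12))
```

The same function applies to the coin-tossing pairs of Exercise 9.23 by calling prob_second_exceeds(20, 1/2).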
Solution to Exercise 9.23 (p. 245) 

pg = (ibinom(20,1/2,0:19))*(cbinom(20,1/2,1:20))'
pg = 0.4373             % Probability each girl throws more

P = cbinom(15,pg,8)
P = 0.3100              % Probability eight or more girls throw more

Solution to Exercise 9.24 (p. 246) 



cg = [18*ones(1,5) 0];
cm = [20*ones(1,4) 0];
pmg = minprob(0.01*[37 52 48 71 63]);
pmm = minprob(0.01*[77 82 75 91]);
[G,PG] = canonicf(cg,pmg);
[M,PM] = canonicf(cm,pmm);
icalc
Enter row matrix of X-values  G
Enter row matrix of Y-values  M
Enter X probabilities  PG
Enter Y probabilities  PM
Use array operations on matrices X, Y, PX, PY, t, u, and P
H = u-t>=10;
p1 = total(H.*P)
p1 = 0.5197

Solution to Exercise 9.25 (p. 246) 

X = 2* [0:10] ; 
PX = ibinom(10, 0.75, 0:10); 
Y = 3* [0:12] ; 

PY = ibinom(12, 0.40, 0:12); 
icalc 

Enter row matrix of X-values X 
Enter row matrix of Y-values Y 
Enter X probabilities PX 
Enter Y probabilities PY 

Use array operations on matrices X, Y, PX, PY, t, u, and P 
PX15 = (X>=15)*PX' 
PX15 = 0.5256 
PY15 = (Y>=15)*PY' 
PY15 = 0.5618 
G = t>=u; 
PG = total(G.*P) 
PG = 0.5811 

Solution to Exercise 9.26 (p. 246) 

W1 = 20*[0:10] - 100;
PW1 = ibinom(10,1/2,0:10);
W2 = 15*[0:20] - 100;
PW2 = ibinom(20,1/3,0:20);
P1pos = (W1>0)*PW1'
P1pos = 0.3770
P2pos = (W2>0)*PW2'
P2pos = 0.5207
icalc
Enter row matrix of X-values  W1
Enter row matrix of Y-values  W2
Enter X probabilities PW1 
Enter Y probabilities PW2 

Use array operations on matrices X, Y, PX, PY, t, u, and P 
G = u >= t; 



PG = total (G.*P) 
PG = 0.5182 

Solution to Exercise 9.27 (p. 246) 

x = 0:5; 
PJ = ibinom(5,0.82,x) ; 
PB = ibinom(5,0.87,x) ; 
PM = ibinom(5,0.80,x) ; 
PE = ibinom(5,0.85,x) ; 

icalc 

Enter row matrix of X-values x 
Enter row matrix of Y-values x 
Enter X probabilities PJ 
Enter Y probabilities PB 

Use array operations on matrices X, Y, PX, PY, t, u, and P 
H = t+u; 

[Tm,Pm] = csort(H,P);
icalc 

Enter row matrix of X-values x 
Enter row matrix of Y-values x 
Enter X probabilities PM 
Enter Y probabilities PE 

Use array operations on matrices X, Y, PX, PY, t, u, and P 
G = t+u; 

[Tw,Pw] = csort(G,P) ; 
icalc 

Enter row matrix of X-values Tm 
Enter row matrix of Y-values Tw 
Enter X probabilities Pm 
Enter Y probabilities Pw 

Use array operations on matrices X, Y, PX, PY, t, u, and P 
Gw = u>=t; 
PGw = total (Gw.*P) 
PGw = 0.5746 

icalc4                         % Alternate using icalc4

Enter row matrix of X-values x 

Enter row matrix of Y-values x 

Enter row matrix of Z-values x 

Enter row matrix of W-values x 

Enter X probabilities PJ 

Enter Y probabilities PB 

Enter Z probabilities PM 

Enter W probabilities PE 

Use array operations on matrices X, Y, Z, W,

PX, PY, PZ, PW, t, u, v, w, and P

H = v+w >= t+u; 

PH = total(H.*P) 

PH = 0.5746 



Chapter 10 

Functions of Random Variables 

10.1 Functions of a Random Variable 1 

Introduction 

Frequently, we observe a value of some random variable, but are really interested in a value derived
from this by a function rule. If X is a random variable and g is a reasonable function (technically, a Borel
function), then $Z = g(X)$ is a new random variable which has the value $g(t)$ for any $\omega$ such that $X(\omega) = t$.
Thus $Z(\omega) = g(X(\omega))$.

10.1.1 The problem; an approach 

We consider, first, functions of a single random variable. A wide variety of functions are utilized in practice. 

Example 10.1: A quality control problem 

In a quality control check on a production line for ball bearings it may be easier to weigh the balls
than measure the diameters. If we can assume true spherical shape and w is the weight, then
diameter is $kw^{1/3}$, where k is a factor depending upon the formula for the volume of a sphere, the
units of measurement, and the density of the steel. Thus, if X is the weight of the sampled ball,
the desired random variable is $D = kX^{1/3}$.

Example 10.2: Price breaks 

The cultural committee of a student organization has arranged a special deal for tickets to a 
concert. The agreement is that the organization will purchase ten tickets at $20 each (regardless 
of the number of individual buyers). Additional tickets are available according to the following 
schedule: 



• 11-20, $18 each 

• 21-30, $16 each 

• 31-50, $15 each 

• 51-100, $13 each 



If the number of purchasers is a random variable X, the total cost (in dollars) is a random quantity 
Z = g (X) described by 

$g(X) = 200 + 18 I_{M1}(X)(X - 10) + (16 - 18) I_{M2}(X)(X - 20)$ (10.1)

$\qquad + (15 - 16) I_{M3}(X)(X - 30) + (13 - 15) I_{M4}(X)(X - 50)$ (10.2)

where $M1 = [10, \infty)$, $M2 = [20, \infty)$, $M3 = [30, \infty)$, $M4 = [50, \infty)$ (10.3)

1 This content is available online at <http://cnx.org/content/m23329/1.5/>.

The function rule is more complicated than in Example 10.1 (A quality control problem), but the 
essential problem is the same. 
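The indicator-function form of the price schedule translates almost word for word into code. The Python sketch below is an illustration under assumptions: the function names are hypothetical, and the second function, which simply prices each block of tickets, is included only to confirm that the indicator formula reproduces the schedule for 0 to 100 purchasers.

```python
def indicator(condition):
    """Indicator function: 1 if the condition holds, else 0."""
    return 1 if condition else 0

def ticket_cost(x):
    """Total cost g(x) from the indicator formula, with M1 = [10, oo), etc."""
    return (200
            + 18 * indicator(x >= 10) * (x - 10)
            + (16 - 18) * indicator(x >= 20) * (x - 20)
            + (15 - 16) * indicator(x >= 30) * (x - 30)
            + (13 - 15) * indicator(x >= 50) * (x - 50))

def ticket_cost_by_blocks(x):
    """Same cost, computed by pricing each block of the schedule directly."""
    cost = 200  # ten tickets at $20 are purchased regardless of demand
    for first, last, price in [(11, 20, 18), (21, 30, 16),
                               (31, 50, 15), (51, 100, 13)]:
        tickets_in_block = max(0, min(x, last) - first + 1)
        cost += price * tickets_in_block
    return cost

for x in (5, 10, 25, 60):
    print(x, ticket_cost(x), ticket_cost_by_blocks(x))
```

Each term after the first corrects the marginal price once a new block begins, which is why the coefficients are differences such as (16 - 18).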

The problem 

If X is a random variable, then $Z = g(X)$ is a new random variable. Suppose we have the distribution
for X. How can we determine $P(Z \in M)$, the probability Z takes a value in the set M?

An approach to a solution

We consider two equivalent approaches.

a. To find $P(X \in M)$.

   a. Mapping approach. Simply find the amount of probability mass mapped into the set M by the
      random variable X.

      • In the absolutely continuous case, calculate $\int_M f_X$.

      • In the discrete case, identify those values $t_i$ of X which are in the set M and add the associated
        probabilities.

   b. Discrete alternative. Consider each value $t_i$ of X. Select those which meet the defining conditions
      for M and add the associated probabilities. This is the approach we use in the MATLAB
      calculations. Note that it is not necessary to describe geometrically the set M; merely use the defining
      conditions.

b. To find $P(g(X) \in M)$.

   a. Mapping approach. Determine the set N of all those t which are mapped into M by the function
      g. Now if $X(\omega) \in N$, then $g(X(\omega)) \in M$, and if $g(X(\omega)) \in M$, then $X(\omega) \in N$. Hence

      $\{\omega : g(X(\omega)) \in M\} = \{\omega : X(\omega) \in N\}$ (10.4)

      Since these are the same event, they must have the same probability. Once N is identified,
      determine $P(X \in N)$ in the usual manner (see part a, above).

   b. Discrete alternative. For each possible value $t_i$ of X, determine whether $g(t_i)$ meets the defining
      condition for M. Select those $t_i$ which do and add the associated probabilities.

□

Remark. The set N in the mapping approach is called the inverse image $N = g^{-1}(M)$.

Example 10.3: A discrete example 

Suppose X has values -2, 0, 1, 3, 6, with respective probabilities 0.2, 0.1, 0.2, 0.3, 0.2.

Consider $Z = g(X) = (X + 1)(X - 4)$. Determine $P(Z > 0)$.

SOLUTION

First solution. The mapping approach

$g(t) = (t + 1)(t - 4)$. $N = \{t : g(t) > 0\}$ is the set of points to the left of -1 or to the right of
4. The X-values -2 and 6 lie in this set. Hence

$P(g(X) > 0) = P(X = -2) + P(X = 6) = 0.2 + 0.2 = 0.4$ (10.5)

Second solution. The discrete alternative

X     =    -2      0      1      3      6
PX    =   0.2    0.1    0.2    0.3    0.2
Z     =     6     -4     -6     -4     14
Z > 0 :     1      0      0      0      1

Table 10.1

Picking out and adding the indicated probabilities, we have

$P(Z > 0) = 0.2 + 0.2 = 0.4$ (10.6)

In this case (and often for "hand calculations") the mapping approach requires less calculation. 
However, for MATLAB calculations (as we show below), the discrete alternative is more readily 
implemented. 
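The discrete alternative amounts to three array steps: evaluate g at each value of X, test the defining condition for the event, and add the probabilities selected. The short Python sketch below mirrors the rows of Table 10.1; it is an illustration written for this example, with variable names chosen here, and is not one of the m-programs.

```python
X = [-2, 0, 1, 3, 6]
PX = [0.2, 0.1, 0.2, 0.3, 0.2]

def g(t):
    return (t + 1) * (t - 4)

Z = [g(t) for t in X]                    # value of Z for each value of X
selected = [int(z > 0) for z in Z]       # 1 where the condition Z > 0 holds
p_positive = sum(p for p, keep in zip(PX, selected) if keep)

print(Z)
print(selected)
print(round(p_positive, 4))
```

Note that no description of the set N = {t : g(t) > 0} is needed; the condition is simply tested value by value.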

Example 10.4: An absolutely continuous example 

Suppose X ~ uniform [-3, 7]. Then $f_X(t) = 0.1$, $-3 \le t \le 7$ (and zero elsewhere). Let

$Z = g(X) = (X + 1)(X - 4)$ (10.7)

Determine $P(Z > 0)$.
SOLUTION

First we determine $N = \{t : g(t) > 0\}$. As in Example 10.3 (A discrete example), $g(t) =
(t + 1)(t - 4) > 0$ for $t < -1$ or $t > 4$. Because of the uniform distribution, the integral of the
density over any subinterval of [-3, 7] is 0.1 times the length of that subinterval. Thus, the desired
probability is

$P(g(X) > 0) = 0.1[(-1 - (-3)) + (7 - 4)] = 0.5$ (10.8)
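The result can be checked numerically by applying the discrete alternative to a fine grid, in the spirit of the approximation procedures used elsewhere in this work. The Python sketch below is an illustration under assumptions: the grid size n and the use of subinterval midpoints are choices made here.

```python
def g(t):
    return (t + 1) * (t - 4)

a, b, n = -3.0, 7.0, 10000      # range of X and number of grid cells
dt = (b - a) / n
density = 1.0 / (b - a)         # uniform density 0.1 on [-3, 7]

# Add the probability mass density*dt of each cell whose midpoint has g > 0.
p_positive = sum(density * dt
                 for k in range(n)
                 if g(a + (k + 0.5) * dt) > 0)
print(round(p_positive, 4))
```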

We consider, next, some important examples. 

Example 10.5: The normal distribution and standardized normal distribution 

To show that if $X \sim N(\mu, \sigma^2)$ then

$Z = g(X) = \frac{X - \mu}{\sigma} \sim N(0, 1)$ (10.9)

VERIFICATION

We wish to show the density function for Z is

$\varphi(t) = \frac{1}{\sqrt{2\pi}} e^{-t^2/2}$ (10.10)

Now

$g(t) = \frac{t - \mu}{\sigma} \le v$ iff $t \le \sigma v + \mu$ (10.11)

Hence, for given $M = (-\infty, v]$ the inverse image is $N = (-\infty, \sigma v + \mu]$, so that

$F_Z(v) = P(Z \le v) = P(Z \in M) = P(X \in N) = P(X \le \sigma v + \mu) = F_X(\sigma v + \mu)$ (10.12)

Since the density is the derivative of the distribution function,

$f_Z(v) = F_Z'(v) = F_X'(\sigma v + \mu)\,\sigma = \sigma f_X(\sigma v + \mu)$ (10.13)

Thus

$f_Z(v) = \frac{\sigma}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\sigma v + \mu - \mu}{\sigma}\right)^2\right] = \frac{1}{\sqrt{2\pi}} e^{-v^2/2} = \varphi(v)$ (10.14)

We conclude that $Z \sim N(0, 1)$.
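The key identity in this verification, that σ f_X(σv + μ) equals the standardized normal density at v, is easy to confirm numerically. The Python sketch below is only a spot check under assumed parameter values (μ = 3 and σ = 2 are arbitrary choices made for illustration); it is not a substitute for the argument above.

```python
import math

def normal_pdf(t, mu, sigma):
    """Density of N(mu, sigma^2) at t."""
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def phi(v):
    """Standardized normal density."""
    return normal_pdf(v, 0.0, 1.0)

mu, sigma = 3.0, 2.0                      # illustrative parameter values
grid = [k / 10 for k in range(-40, 41)]   # v from -4 to 4
max_gap = max(abs(sigma * normal_pdf(sigma * v + mu, mu, sigma) - phi(v))
              for v in grid)
print(max_gap)
```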






Example 10.6: Affine functions

Suppose X has distribution function $F_X$. If it is absolutely continuous, the corresponding density is
$f_X$. Consider $Z = aX + b$ ($a \ne 0$). Here $g(t) = at + b$, an affine function (linear plus a constant).
Determine the distribution function for Z (and the density in the absolutely continuous case).
SOLUTION

$F_Z(v) = P(Z \le v) = P(aX + b \le v)$ (10.15)

There are two cases

• $a > 0$:

  $F_Z(v) = P\left(X \le \frac{v - b}{a}\right) = F_X\left(\frac{v - b}{a}\right)$ (10.16)

• $a < 0$:

  $F_Z(v) = P\left(X \ge \frac{v - b}{a}\right) = P\left(X > \frac{v - b}{a}\right) + P\left(X = \frac{v - b}{a}\right)$ (10.17)

So that

$F_Z(v) = 1 - F_X\left(\frac{v - b}{a}\right) + P\left(X = \frac{v - b}{a}\right)$ (10.18)

For the absolutely continuous case, $P\left(X = \frac{v - b}{a}\right) = 0$, and by differentiation

• for $a > 0$: $f_Z(v) = \frac{1}{a} f_X\left(\frac{v - b}{a}\right)$

• for $a < 0$: $f_Z(v) = -\frac{1}{a} f_X\left(\frac{v - b}{a}\right)$

Since for $a < 0$, $-a = |a|$, the two cases may be combined into one formula.

$f_Z(v) = \frac{1}{|a|} f_X\left(\frac{v - b}{a}\right)$ (10.19)
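The combined formula can be spot-checked for a case with a < 0 by comparing it with a numerical derivative of the distribution function. In the Python sketch below, the choice X ~ exponential (2) with Z = -3X + 1 is an assumption made purely for illustration, as are the step size h and the test points.

```python
import math

lam, a, b = 2.0, -3.0, 1.0     # illustrative: X ~ exponential(2), Z = -3X + 1

def F_X(t):
    return 1.0 - math.exp(-lam * t) if t >= 0 else 0.0

def f_X(t):
    return lam * math.exp(-lam * t) if t >= 0 else 0.0

def F_Z(v):
    # Case a < 0, absolutely continuous X: F_Z(v) = 1 - F_X((v - b)/a)
    return 1.0 - F_X((v - b) / a)

def f_Z(v):
    # Combined formula: f_Z(v) = f_X((v - b)/a) / |a|
    return f_X((v - b) / a) / abs(a)

h = 1e-6
points = [b - 0.1 * k for k in range(1, 30)]   # values v < b, where f_Z > 0
max_gap = max(abs((F_Z(v + h) - F_Z(v - h)) / (2 * h) - f_Z(v))
              for v in points)
print(max_gap)
```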



Example 10.7: Completion of normal and standardized normal relationship 

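Suppose $Z \sim N(0, 1)$. Show that $X = \sigma Z + \mu$ ($\sigma > 0$) is $N(\mu, \sigma^2)$.
VERIFICATION
Use of the result of Example 10.6 (Affine functions) on affine functions shows that

$f_X(t) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{t - \mu}{\sigma}\right)^2\right]$ (10.20)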



Example 10.8: Fractional power of a nonnegative random variable 

Suppose $X \ge 0$ and $Z = g(X) = X^{1/a}$ for $a > 1$. Since for $t \ge 0$, $t^{1/a}$ is increasing, we have $0 \le t^{1/a} \le v$ iff $0 \le t \le v^a$. Thus

$$F_Z(v) = P(Z \le v) = P(X \le v^a) = F_X(v^a) \tag{10.21}$$

In the absolutely continuous case

$$f_Z(v) = F_Z'(v) = f_X(v^a)\, a v^{a - 1} \tag{10.22}$$



Example 10.9: Fractional power of an exponentially distributed random variable 

Suppose $X \sim$ exponential $(\lambda)$. Then $Z = X^{1/a} \sim$ Weibull $(a, \lambda, 0)$.
According to the result of Example 10.8 (Fractional power of a nonnegative random variable),

$$F_Z(t) = F_X(t^a) = 1 - e^{-\lambda t^a} \tag{10.23}$$

which is the distribution function for $Z \sim$ Weibull $(a, \lambda, 0)$.

Example 10.10: A simple approximation as a function of X 

If $X$ is a random variable, a simple function approximation may be constructed (see Distribution Approximations). We limit our discussion to the bounded case, in which the range of $X$ is limited to a bounded interval $I = [a, b]$. Suppose $I$ is partitioned into $n$ subintervals by points $t_i$, $1 \le i \le n - 1$, with $a = t_0$ and $b = t_n$. Let $M_i = [t_{i-1}, t_i)$ be the $i$th subinterval, $1 \le i \le n - 1$, and $M_n = [t_{n-1}, t_n]$. Let $E_i = X^{-1}(M_i)$ be the set of points mapped into $M_i$ by $X$. Then the $E_i$ form a partition of the basic space $\Omega$. For the given subdivision, we form a simple random variable $X_s$ as follows. In each subinterval, pick a point $s_i$, $t_{i-1} \le s_i < t_i$. The simple random variable

$$X_s = \sum_{i=1}^{n} s_i I_{E_i} \tag{10.24}$$

approximates $X$ to within the length of the largest subinterval $M_i$. Now $I_{E_i} = I_{M_i}(X)$, since $I_{E_i}(\omega) = 1$ iff $X(\omega) \in M_i$ iff $I_{M_i}(X(\omega)) = 1$. We may thus write

$$X_s = \sum_{i=1}^{n} s_i I_{M_i}(X), \quad \text{a function of } X \tag{10.25}$$



10.1.2 Use of MATLAB on simple random variables 

For simple random variables, we use the discrete alternative approach, since this may be implemented easily 
with MATLAB. Suppose the distribution for X is expressed in the row vectors X and PX. 

• We perform array operations on vector X to obtain

$$G = [g(t_1) \;\; g(t_2) \;\; \cdots \;\; g(t_n)] \tag{10.26}$$

• We use relational and logical operations on G to obtain a matrix M which has ones for those $t_i$ (values of X) such that $g(t_i)$ satisfies the desired condition (and zeros elsewhere).

• The zero-one matrix M is used to select the corresponding $p_i = P(X = t_i)$ and sum them by taking the dot product of M and PX.
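The three steps above can be sketched with base MATLAB operations alone. The values and probabilities below are illustrative assumptions, not data from the text.

```matlab
% Three-step calculation for a function of a simple random variable.
% The distribution (X, PX) is a made-up illustration.
X  = [-2 -1 0 1 2];            % values t_i of X
PX = [0.1 0.2 0.4 0.2 0.1];    % probabilities P(X = t_i)

G  = X.^2 - 1;                 % step 1: array operation gives G = g(X)
M  = (G >= 0);                 % step 2: ones where g(t_i) meets the condition
PM = M*PX';                    % step 3: dot product sums the selected probabilities
disp(PM)
```

Here M selects the values -2, -1, 1 and 2, so PM is P(g(X) >= 0) = 0.1 + 0.2 + 0.2 + 0.1.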

Example 10.11: Basic calculations for a function of a simple random variable 

X = -5:10;                              % Values of X
PX = ibinom(15, 0.6, 0:15);             % Probabilities for X
G = (X + 6).*(X - 1).*(X - 8);          % Array operations on X matrix to get G = g(X)
M = (G > -100)&(G < 130);               % Relational and logical operations on G
PM = M*PX'                              % Sum of probabilities for selected values
PM = 0.4800
disp([X;G;M;PX]')                       % Display of various matrices (as columns)
   -5.0000   78.0000    1.0000    0.0000
   -4.0000  120.0000    1.0000    0.0000
   -3.0000  132.0000         0    0.0003
   -2.0000  120.0000    1.0000    0.0016
   -1.0000   90.0000    1.0000    0.0074
         0   48.0000    1.0000    0.0245
    1.0000         0    1.0000    0.0612
    2.0000  -48.0000    1.0000    0.1181
    3.0000  -90.0000    1.0000    0.1771
    4.0000 -120.0000         0    0.2066
    5.0000 -132.0000         0    0.1859
    6.0000 -120.0000         0    0.1268
    7.0000  -78.0000    1.0000    0.0634
    8.0000         0    1.0000    0.0219
    9.0000  120.0000    1.0000    0.0047
   10.0000  288.0000         0    0.0005
[Z,PZ] = csort(G,PX);                   % Sorting and consolidating to obtain
disp([Z;PZ]')                           % the distribution for Z = g(X)
 -132.0000    0.1859
 -120.0000    0.3334
  -90.0000    0.1771
  -78.0000    0.0634
  -48.0000    0.1181
         0    0.0832
   48.0000    0.0245
   78.0000    0.0000
   90.0000    0.0074
  120.0000    0.0064
  132.0000    0.0003
  288.0000    0.0005
P1 = (G<-120)*PX'                       % Further calculation using G, PX
P1 = 0.1859
p1 = (Z<-120)*PZ'                       % Alternate using Z, PZ
p1 = 0.1859

Example 10.12 

$X = 10 I_A + 18 I_B + 10 I_C$ with $\{A, B, C\}$ independent and $P = [0.6 \;\; 0.3 \;\; 0.5]$.
We calculate the distribution for $X$, then determine the distribution for

$$Z = X^{1/2} - X + 50 \tag{10.27}$$

c = [10 18 10 0];
pm = minprob(0.1*[6 3 5]);
canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
disp(XDBN)
         0    0.1400
   10.0000    0.3500
   18.0000    0.0600
   20.0000    0.2100
   28.0000    0.1500
   38.0000    0.0900
G = sqrt(X) - X + 50;                   % Formation of G matrix
[Z,PZ] = csort(G,PX);                   % Sorts distinct values of g(X)
disp([Z;PZ]')                           % consolidates probabilities
   18.1644    0.0900
   27.2915    0.1500
   34.4721    0.2100
   36.2426    0.0600
   43.1623    0.3500
   50.0000    0.1400
M = (Z < 20)|(Z >= 40)                  % Direct use of Z distribution
M =  1   0   0   0   1   1
PZM = M*PZ'
PZM = 0.5800



Remark. Note that with the m-function csort, we may name the output as desired. 
Example 10.13: Continuation of Example 10.12, above. 



H = 2*X.^2 - 3*X + 1;
[W,PW] = csort(H,PX)
W  =       1     171     595     741    1485    2775
PW =  0.1400  0.3500  0.0600  0.2100  0.1500  0.0900
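For readers without the m-programs canonic, minprob, and csort, the calculation in Example 10.12 can be sketched in base MATLAB. This is a stand-in under stated assumptions, not the author's implementation: it enumerates the eight indicator combinations directly and consolidates equal values with unique and accumarray, which relies on exact equality of the computed values.

```matlab
% Distribution of X = 10 I_A + 18 I_B + 10 I_C, {A, B, C} independent,
% then of Z = sqrt(X) - X + 50, using base MATLAB only.
pA = 0.6;  pB = 0.3;  pC = 0.5;
[a, b, c] = ndgrid([0 1], [0 1], [0 1]);          % all 8 indicator combinations
vals = 10*a + 18*b + 10*c;                        % value of X on each combination
prob = (pA.^a).*((1 - pA).^(1 - a)) ...
     .*(pB.^b).*((1 - pB).^(1 - b)) ...
     .*(pC.^c).*((1 - pC).^(1 - c));              % product rule for independence

[X, ~, k] = unique(vals(:)');                     % distinct values, sorted
PX = accumarray(k(:), prob(:))';                  % add probabilities of equal values

G = sqrt(X) - X + 50;
[Z, ~, k] = unique(G);                            % plays the role of csort
PZ = accumarray(k(:), PX(:))';
PZM = ((Z < 20) | (Z >= 40))*PZ';
disp([X; PX]')
disp(PZM)
```

The displayed distribution should match XDBN above, and PZM should reproduce the value 0.5800.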



Example 10.14: A discrete approximation 

Suppose $X$ has density function $f_X(t) = \frac{1}{2}(3t^2 + 2t)$ for $0 \le t \le 1$. Then $F_X(t) = \frac{1}{2}(t^3 + t^2)$. Let $Z = X^{1/2}$. We may use the approximation m-procedure tappr to obtain an approximate discrete distribution. Then we work with the approximating random variable as a simple random variable. Suppose we want $P(Z \le 0.8)$. Now $Z \le 0.8$ iff $X \le 0.8^2 = 0.64$. The desired probability may be calculated to be

$$P(Z \le 0.8) = F_X(0.64) = (0.64^3 + 0.64^2)/2 = 0.3359 \tag{10.28}$$

Using the approximation procedure, we have

tappr
Enter matrix [a b] of x-range endpoints  [0 1]
Enter number of x approximation points  200
Enter density as a function of t  (3*t.^2 + 2*t)/2
Use row matrices X and PX as in the simple case
G = X.^(1/2);
M = G <= 0.8;
PM = M*PX'
PM = 0.3359                             % Agrees quite closely with the theoretical
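The discrete approximation in Example 10.14 can be imitated without tappr. The sketch below assumes a midpoint discretization with probabilities proportional to the density; the internal details of tappr may differ, so this is only an illustration of the idea.

```matlab
% Midpoint approximation to the distribution with density (3t^2 + 2t)/2 on [0, 1].
% Assumption: n equal subintervals, mass proportional to the density at midpoints.
n  = 200;  a = 0;  b = 1;
dx = (b - a)/n;
X  = a + dx*((1:n) - 0.5);          % midpoints of the subintervals
PX = (3*X.^2 + 2*X)/2*dx;           % approximate probability of each subinterval
PX = PX/sum(PX);                    % normalize to total probability one

G  = X.^(1/2);                      % Z = g(X)
PM = (G <= 0.8)*PX';                % approximate P(Z <= 0.8)
exact = (0.64^3 + 0.64^2)/2;        % F_X(0.64), from (10.28)
disp([PM exact])
```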



10.2 Function of Random Vectors 2 

Introduction 

2 This content is available online at <http://cnx.org/content/m23332/1.5/>.

The general mapping approach for a single random variable and the discrete alternative extends to
functions of more than one variable. It is convenient to consider the case of two random variables, considered 
jointly. Extensions to more than two random variables are made similarly, although the details are more 
complicated. 

10.2.1 The general approach extended to a pair

Consider a pair {X, Y} having joint distribution on the plane. The approach is analogous to that for a single 
random variable with distribution on the line. 

a. To find $P((X, Y) \in Q)$.

   a. Mapping approach. Simply find the amount of probability mass mapped into the set $Q$ on the plane by the random vector $W = (X, Y)$.

      • In the absolutely continuous case, calculate $\iint_Q f_{XY}$.

      • In the discrete case, identify those vector values $(t_i, u_j)$ of $(X, Y)$ which are in the set $Q$ and add the associated probabilities.

   b. Discrete alternative. Consider each vector value $(t_i, u_j)$ of $(X, Y)$. Select those which meet the defining conditions for $Q$ and add the associated probabilities. This is the approach we use in the MATLAB calculations. It does not require that we describe geometrically the region $Q$.

b. To find $P(g(X, Y) \in M)$. $g$ is real valued and $M$ is a subset of the real line.

   a. Mapping approach. Determine the set $Q$ of all those $(t, u)$ which are mapped into $M$ by the function $g$. Now

      $$W(\omega) = (X(\omega), Y(\omega)) \in Q \quad \text{iff} \quad g(X(\omega), Y(\omega)) \in M \tag{10.29}$$

      Hence

      $$\{\omega : g(X(\omega), Y(\omega)) \in M\} = \{\omega : (X(\omega), Y(\omega)) \in Q\} \tag{10.30}$$

      Since these are the same event, they must have the same probability. Once $Q$ is identified on the plane, determine $P((X, Y) \in Q)$ in the usual manner (see part a, above).

   b. Discrete alternative. For each possible vector value $(t_i, u_j)$ of $(X, Y)$, determine whether $g(t_i, u_j)$ meets the defining condition for $M$. Select those $(t_i, u_j)$ which do and add the associated probabilities.

We illustrate the mapping approach in the absolutely continuous case. A key element in the approach is finding the set $Q$ on the plane such that $g(X, Y) \in M$ iff $(X, Y) \in Q$. The desired probability is obtained by integrating $f_{XY}$ over $Q$.

Figure 10.1: Distribution for Example 10.15 (A numerical example). [Figure: region bounded by t = 0, t = 2, u = 0, u = max{1, t}, with density $f_{XY}(t, u) = (6/37)(t + 2u)$.]



Example 10.15: A numerical example 

The pair $\{X, Y\}$ has joint density $f_{XY}(t, u) = \frac{6}{37}(t + 2u)$ on the region bounded by $t = 0$, $t = 2$, $u = 0$, $u = \max\{1, t\}$ (see Figure 10.1). Determine $P(Y \le X) = P(X - Y \ge 0)$. Here $g(t, u) = t - u$ and $M = [0, \infty)$. Now $Q = \{(t, u) : t - u \ge 0\} = \{(t, u) : u \le t\}$, which is the region on the plane on or below the line $u = t$. Examination of the figure shows that for this region, $f_{XY}$ is different from zero on the triangle bounded by $t = 2$, $u = 0$, and $u = t$. The desired probability is

$$P(Y \le X) = \int_0^2 \int_0^t \frac{6}{37}(t + 2u)\, du\, dt = 32/37 \approx 0.8649 \tag{10.31}$$
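The value 32/37 can be checked by numerical integration. The sketch below assumes a MATLAB release that provides integral2 (not available in MATLAB 7); the upper limit of the inner integral is passed as a function handle of t.

```matlab
% Numerical check of P(Y <= X) for the density (6/37)(t + 2u).
f = @(t, u) (6/37)*(t + 2*u);

% Total mass over the region 0 <= t <= 2, 0 <= u <= max(1, t): should be 1.
mass = integral2(f, 0, 2, 0, @(t) max(t, 1));

% Probability over the triangle 0 <= t <= 2, 0 <= u <= t.
p = integral2(f, 0, 2, 0, @(t) t);
disp([mass p 32/37])
```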



Example 10.16: The density for the sum X + Y 

Suppose the pair $\{X, Y\}$ has joint density $f_{XY}$. Determine the density for

$$Z = X + Y \tag{10.32}$$

SOLUTION

$$F_Z(v) = P(X + Y \le v) = P((X, Y) \in Q_v) \quad \text{where} \quad Q_v = \{(t, u) : t + u \le v\} = \{(t, u) : u \le v - t\} \tag{10.33}$$

For any fixed $v$, the region $Q_v$ is the portion of the plane on or below the line $u = v - t$ (see Figure 10.2). Thus

$$F_Z(v) = \iint_{Q_v} f_{XY} = \int_{-\infty}^{\infty} \int_{-\infty}^{v - t} f_{XY}(t, u)\, du\, dt \tag{10.34}$$

Differentiating with the aid of the fundamental theorem of calculus, we get

$$f_Z(v) = \int_{-\infty}^{\infty} f_{XY}(t, v - t)\, dt \tag{10.35}$$

This integral expression is known as a convolution integral.

Figure 10.2: Region $Q_v$ for $X + Y \le v$. [Figure: the region $Q_v = \{(t, u) : u \le v - t\}$.]



Example 10.17: Sum of joint uniform random variables 

Suppose the pair $\{X, Y\}$ has joint uniform density on the unit square $0 \le t \le 1$, $0 \le u \le 1$. Determine the density for $Z = X + Y$.

SOLUTION

$F_Z(v)$ is the probability in the region $Q_v : u \le v - t$. Now $P_{XY}(Q_v) = 1 - P_{XY}(Q_v^c)$, where the complementary set $Q_v^c$ is the set of points above the line. As Figure 10.3 shows, for $v \le 1$, the part of $Q_v$ which has probability mass is the lower shaded triangular region on the figure, which has area (and hence probability) $v^2/2$. For $v > 1$, the complementary region $Q_v^c$ is the upper shaded region. It has area $(2 - v)^2/2$, so that in this case, $P_{XY}(Q_v) = 1 - (2 - v)^2/2$. Thus,

$$F_Z(v) = \frac{v^2}{2} \text{ for } 0 \le v \le 1 \quad \text{and} \quad F_Z(v) = 1 - \frac{(2 - v)^2}{2} \text{ for } 1 \le v \le 2 \tag{10.36}$$

Differentiation shows that $Z$ has the symmetric triangular distribution on $[0, 2]$, since

$$f_Z(v) = v \text{ for } 0 \le v \le 1 \quad \text{and} \quad f_Z(v) = (2 - v) \text{ for } 1 \le v \le 2 \tag{10.37}$$

With the use of indicator functions, these may be combined into a single expression

$$f_Z(v) = I_{[0,1]}(v)\, v + I_{(1,2]}(v)\, (2 - v) \tag{10.38}$$

Figure 10.3: Geometry for sum of joint uniform random variables. [Figure: the line $u = v - t$ on the unit square, drawn for $v \le 1$ and for $v > 1$.]



ALTERNATE SOLUTION 

Since $f_{XY}(t, u) = I_{[0,1]}(t)\, I_{[0,1]}(u)$, we have $f_{XY}(t, v - t) = I_{[0,1]}(t)\, I_{[0,1]}(v - t)$. Now $0 \le v - t \le 1$ iff $v - 1 \le t \le v$, so that

$$f_{XY}(t, v - t) = I_{[0,1]}(v)\, I_{[0,v]}(t) + I_{(1,2]}(v)\, I_{[v-1,1]}(t) \tag{10.39}$$

Integration with respect to t gives the result above. 
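The convolution integral for the uniform case can also be approximated by a discrete convolution. The sketch below is an illustration only; it samples both marginal densities on n equal subintervals (an assumed grid, with n = 1000) and uses the base MATLAB function conv.

```matlab
% Discrete approximation to f_Z(v) = integral of f_X(t) f_Y(v - t) dt
% for X, Y independent and uniform on [0, 1].
n  = 1000;  dt = 1/n;
fX = ones(1, n);                        % density of X at the n subinterval midpoints
fY = ones(1, n);                        % density of Y at the same points
fZ = conv(fX, fY)*dt;                   % Riemann-sum version of the convolution
v  = (1:2*n-1)*dt;                      % fZ(k) approximates f_Z at v = k*dt

tri = v.*(v <= 1) + (2 - v).*(v > 1);   % triangular density from (10.38)
disp(max(abs(fZ - tri)))                % maximum discrepancy
```

For this piecewise-constant case the discrete convolution reproduces the triangular density at the grid points up to rounding error.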

Independence of functions of independent random variables 

Suppose {X, Y} is an independent pair. Let Z = g (X) , W = h (Y). Since 

$$Z^{-1}(M) = X^{-1}\left[g^{-1}(M)\right] \quad \text{and} \quad W^{-1}(N) = Y^{-1}\left[h^{-1}(N)\right] \tag{10.40}$$

the pair $\{Z^{-1}(M), W^{-1}(N)\}$ is independent for each pair $\{M, N\}$. Thus, the pair $\{Z, W\}$ is independent.
If {X,Y} is an independent pair and Z = g (X) , W = h(Y), then the pair {Z,W} is independent. 
However, if Z = g (X, Y) and W = h (X, Y), then in general {Z, W} is not independent. This is illustrated 
for simple random variables with the aid of the m-procedure jointzw at the end of the next section. 

Example 10.18: Independence of simple approximations to an independent pair 

Suppose $\{X, Y\}$ is an independent pair with simple approximations $X_s$ and $Y_s$ as described in Distribution Approximations.

$$X_s = \sum_{i=1}^{n} t_i I_{E_i} = \sum_{i=1}^{n} t_i I_{M_i}(X) \quad \text{and} \quad Y_s = \sum_{j=1}^{m} u_j I_{F_j} = \sum_{j=1}^{m} u_j I_{N_j}(Y) \tag{10.41}$$

As functions of $X$ and $Y$, respectively, the pair $\{X_s, Y_s\}$ is independent. Also each pair $\{I_{M_i}(X), I_{N_j}(Y)\}$ is independent.

10.2.2 Use of MATLAB on pairs of simple random variables

In the single-variable case, we use array operations on the values of $X$ to determine a matrix of values of $g(X)$. In the two-variable case, we must use array operations on the calculating matrices t and u to obtain a matrix G whose elements are $g(t_i, u_j)$. To obtain the distribution for $Z = g(X, Y)$, we may use the m-function csort on G and the joint probability matrix P. A first step, then, is the use of jcalc or icalc to set up the joint distribution and the calculating matrices. This is illustrated in the following example.

Example 10.19 



% file jdemo3.m
% data for joint simple distribution
X = [-4 -2 0 1 3];
Y = [0 1 2 4];
P = [0.0132 0.0198 0.0297 0.0209 0.0264;
     0.0372 0.0558 0.0837 0.0589 0.0744;
     0.0516 0.0774 0.1161 0.0817 0.1032;
     0.0180 0.0270 0.0405 0.0285 0.0360];

jdemo3                                  % Call for data
jcalc                                   % Set up of calculating matrices t, u.
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
Use array operations on matrices X, Y, PX, PY, t, u, and P
G = t.^2 - 3*u;                         % Formation of G = [g(ti,uj)]
M = G >= 1;                             % Calculation using the XY distribution
PM = total(M.*P)                        % Alternately, use total((G>=1).*P)
PM = 0.4665
[Z,PZ] = csort(G,P);
PM = (Z>=1)*PZ'                         % Calculation using the Z distribution
PM = 0.4665
disp([Z;PZ]')                           % Display of the Z distribution
  -12.0000    0.0297
  -11.0000    0.0209
   -8.0000    0.0198
   -6.0000    0.0837
   -5.0000    0.0589
   -3.0000    0.1425
   -2.0000    0.1375
         0    0.0405
    1.0000    0.1059
    3.0000    0.0744
    4.0000    0.0402
    6.0000    0.1032
    9.0000    0.0360
   10.0000    0.0372
   13.0000    0.0516
   16.0000    0.0180

We extend the example above by considering a function $W = h(X, Y)$ which has a composite definition.

Example 10.20: Continuation of Example 10.19

Let

$$W = \begin{cases} X & \text{for } X + Y \ge 1 \\ X^2 + Y^2 & \text{for } X + Y < 1 \end{cases} \tag{10.42}$$

Determine the distribution for $W$.

H = t.*(t+u>=1) + (t.^2 + u.^2).*(t+u<1);   % Specification of h(t,u)
[W,PW] = csort(H,P);                    % Distribution for W = h(X,Y)
disp([W;PW]')
   -2.0000    0.0198
         0    0.2700
    1.0000    0.1900
    3.0000    0.2400
    4.0000    0.0270
    5.0000    0.0774
    8.0000    0.0558
   16.0000    0.0180
   17.0000    0.0516
   20.0000    0.0372
   32.0000    0.0132
ddbn                                    % Plot of distribution function
Enter row matrix of values  W
Enter row matrix of probabilities  PW
print                                   % See Figure 10.4

Figure 10.4: Distribution for random variable W in Example 10.20 (Continuation of Example 10.19). [Figure: plot titled "Distribution Function", horizontal axis t.]



Joint distributions for two functions of (X, Y) 

In previous treatments, we use csort to obtain the marginal distribution for a single function $Z = g(X, Y)$. It is often desirable to have the joint distribution for a pair $Z = g(X, Y)$ and $W = h(X, Y)$. As special cases, we may have $Z = X$ or $W = Y$. Suppose

$$Z \text{ has values } [z_1 \;\; z_2 \;\; \cdots \;\; z_c] \quad \text{and} \quad W \text{ has values } [w_1 \;\; w_2 \;\; \cdots \;\; w_r] \tag{10.43}$$

The joint distribution requires the probability of each pair, $P(W = w_i, Z = z_j)$. Each such pair of values corresponds to a set of pairs of $X$ and $Y$ values. To determine the joint probability matrix PZW for $(Z, W)$ arranged as on the plane, we assign to each position $(i, j)$ the probability $P(W = w_i, Z = z_j)$, with values of $W$ increasing upward. Each pair of $(W, Z)$ values corresponds to one or more pairs of $(Y, X)$ values. If we select and add the probabilities corresponding to the latter pairs, we have $P(W = w_i, Z = z_j)$. This may be accomplished as follows:

1. Set up calculation matrices t and u as with jcalc.

2. Use array arithmetic to determine the matrices of values $G = [g(t, u)]$ and $H = [h(t, u)]$.

3. Use csort to determine the Z and W value matrices and the PZ and PW marginal probability matrices.

4. For each pair $(w_i, z_j)$, use the MATLAB function find to determine the positions a for which

   (H==W(i))&(G==Z(j))                  (10.44)

5. Assign to the $(i, j)$ position in the joint probability matrix PZW for $(Z, W)$ the probability

   PZW(i,j) = total(P(a))               (10.45)

We first examine the basic calculations, which are then implemented in the m-procedure jointzw.
Example 10.21: Illustration of the basic joint calculations 



% file jdemo7.m
P = [0.061 0.030 0.060 0.027 0.009;
     0.015 0.001 0.048 0.058 0.013;
     0.040 0.054 0.012 0.004 0.013;
     0.032 0.029 0.026 0.023 0.039;
     0.058 0.040 0.061 0.053 0.018;
     0.050 0.052 0.060 0.001 0.013];
X = -2:2;
Y = -2:3;

jdemo7                                  % Call for data in jdemo7.m
jcalc                                   % Used to set up calculation matrices
H = u.^2                                % Matrix of values for W = h(X,Y)
H =
     9     9     9     9     9
     4     4     4     4     4
     1     1     1     1     1
     0     0     0     0     0
     1     1     1     1     1
     4     4     4     4     4
G = abs(t)                              % Matrix of values for Z = g(X,Y)
G =
     2     1     0     1     2
     2     1     0     1     2
     2     1     0     1     2
     2     1     0     1     2
     2     1     0     1     2
     2     1     0     1     2
[W,PW] = csort(H,P)                     % Determination of marginal for W
W  =       0       1       4       9
PW =  0.1490  0.3530  0.3110  0.1870
[Z,PZ] = csort(G,P)                     % Determination of marginal for Z
Z  =       0       1       2
PZ =  0.2670  0.3720  0.3610
r = W(3)                                % Third value for W
r = 4
s = Z(2)                                % Second value for Z
s = 1

To determine $P(W = 4, Z = 1)$, we need to determine the $(t, u)$ positions for which this pair of $(W, Z)$ values is taken on. By inspection, we find these to be (2,2), (6,2), (2,4), and (6,4). Then $P(W = 4, Z = 1)$ is the total probability at these positions. This is 0.001 + 0.052 + 0.058 + 0.001 = 0.112. We put this probability in the joint probability matrix PZW at the $W = 4$, $Z = 1$ position. This may be achieved by MATLAB with the following operations.

[i,j] = find((H==W(3))&(G==Z(2)));      % Location of (t,u) positions
disp([i j])                             % Optional display of positions
     2     2
     6     2
     2     4
     6     4
a = find((H==W(3))&(G==Z(2)));          % Location in more convenient form
P0 = zeros(size(P));                    % Setup of zero matrix
P0(a) = P(a)                            % Display of designated probabilities in P
P0 =
         0         0         0         0         0
         0    0.0010         0    0.0580         0
         0         0         0         0         0
         0         0         0         0         0
         0         0         0         0         0
         0    0.0520         0    0.0010         0
PZW = zeros(length(W),length(Z))        % Initialization of PZW matrix
PZW(3,2) = total(P(a))                  % Assignment to PZW matrix with
PZW =                                   % W increasing downward
         0         0         0
         0         0         0
         0    0.1120         0
         0         0         0
PZW = flipud(PZW)                       % Assignment with W increasing upward
PZW =
         0         0         0
         0    0.1120         0
         0         0         0
         0         0         0



The procedure jointzw carries out this operation for each possible pair of W and Z values (with the flipud operation coming only after all individual assignments are made).
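The full loop over pairs of values can be sketched in base MATLAB as follows. This is not the author's jointzw; it is a stand-in that assumes the calculating matrices t and u are arranged as on the plane (u decreasing down the rows), builds them with meshgrid, and uses unique and sum in place of csort and total. The data are those of jdemo7.m above.

```matlab
% Joint probability matrix PZW for Z = |X| and W = Y^2 (data of jdemo7.m).
P = [0.061 0.030 0.060 0.027 0.009;
     0.015 0.001 0.048 0.058 0.013;
     0.040 0.054 0.012 0.004 0.013;
     0.032 0.029 0.026 0.023 0.039;
     0.058 0.040 0.061 0.053 0.018;
     0.050 0.052 0.060 0.001 0.013];
X = -2:2;  Y = -2:3;
[t, u] = meshgrid(X, fliplr(Y));        % rows follow Y from largest to smallest
G = abs(t);                             % values of Z = g(X,Y)
H = u.^2;                               % values of W = h(X,Y)
Z = unique(G(:))';                      % distinct Z values, increasing
W = unique(H(:))';                      % distinct W values, increasing

PZW = zeros(length(W), length(Z));
for i = 1:length(W)
    for j = 1:length(Z)
        a = find((H == W(i)) & (G == Z(j)));   % positions giving (W(i), Z(j))
        PZW(i, j) = sum(P(a));                 % total probability there
    end
end
PZW = flipud(PZW);                      % W increasing upward, as on the plane
disp(PZW)
```

After the flip, the row for W = 4 is the second row, so PZW(2,2) holds P(W = 4, Z = 1) = 0.112, and the column sums give the marginal PZ.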

Example 10.22: Joint distribution for $Z = g(X, Y) = ||X| - Y|$ and $W = h(X, Y) = |XY|$

% file jdemo3.m     data for joint simple distribution
X = [-4 -2 0 1 3];
Y = [0 1 2 4];
P = [0.0132 0.0198 0.0297 0.0209 0.0264;
     0.0372 0.0558 0.0837 0.0589 0.0744;
     0.0516 0.0774 0.1161 0.0817 0.1032;
     0.0180 0.0270 0.0405 0.0285 0.0360];

jdemo3                                  % Call for data
jointzw                                 % Call for m-program
Enter joint prob for (X,Y): P
Enter values for X: X
Enter values for Y: Y
Enter expression for g(t,u): abs(abs(t)-u)
Enter expression for h(t,u): abs(t.*u)
Use array operations on Z, W, PZ, PW, v, w, PZW
disp(PZW)
    0.0132         0         0         0         0
         0    0.0264         0         0         0
         0         0    0.0570         0         0
         0    0.0744         0         0         0
    0.0558         0         0    0.0725         0
         0         0    0.1032         0         0
         0    0.1363         0         0         0
    0.0817         0         0         0         0
    0.0405    0.1446    0.1107    0.0360    0.0477
EZ = total(v.*PZW)
EZ = 1.4398
ez = Z*PZ'                              % Alternate, using marginal dbn
ez = 1.4398
EW = total(w.*PZW)
EW = 2.6075
ew = W*PW'                              % Alternate, using marginal dbn
ew = 2.6075
M = v > w;                              % P(Z>W)
PM = total(M.*PZW)
PM = 0.3390

As noted in the previous section, if $\{X, Y\}$ is an independent pair and $Z = g(X)$, $W = h(Y)$, then the pair $\{Z, W\}$ is independent. However, if $Z = g(X, Y)$ and $W = h(X, Y)$, then in general the pair $\{Z, W\}$ is not independent. We may illustrate this with the aid of the m-procedure jointzw.

Example 10.23: Functions of independent random variables 



jdemo3
itest
Enter matrix of joint probabilities  P
The pair {X,Y} is independent           % The pair {X,Y} is independent
jointzw
Enter joint prob for (X,Y): P
Enter values for X: X
Enter values for Y: Y
Enter expression for g(t,u): t.^2 - 3*t % Z = g(X)
Enter expression for h(t,u): abs(u) + 3 % W = h(Y)
Use array operations on Z, W, PZ, PW, v, w, PZW
itest
Enter matrix of joint probabilities  PZW
The pair {X,Y} is independent           % The pair {g(X),h(Y)} is independent
jdemo3                                  % Refresh data
jointzw
Enter joint prob for (X,Y): P
Enter values for X: X
Enter values for Y: Y
Enter expression for g(t,u): t+u        % Z = g(X,Y)
Enter expression for h(t,u): t.*u       % W = h(X,Y)
Use array operations on Z, W, PZ, PW, v, w, PZW
itest
Enter matrix of joint probabilities  PZW
The pair {X,Y} is NOT independent       % The pair {g(X,Y),h(X,Y)} is not indep
To see where the product rule fails, call for D   % Fails for all pairs



10.2.3 Absolutely continuous case: analysis and approximation

As in the analysis Joint Distributions, we may set up a simple approximation to the joint distribution and 
proceed as for simple random variables. In this section, we solve several examples analytically, then obtain 
simple approximations. 

Example 10.24: Distribution for a product 

Suppose the pair $\{X, Y\}$ has joint density $f_{XY}$. Let $Z = XY$. Determine $Q_v$ such that $P(Z \le v) = P((X, Y) \in Q_v)$.

Figure 10.5: Region $Q_v$ for product $XY$, $v \ge 0$. [Figure: the curve $u = v/t$, $v > 0$.]

SOLUTION (see Figure 10.5)

$$Q_v = \{(t, u) : tu \le v\} = \{(t, u) : t > 0,\; u \le v/t\} \cup \{(t, u) : t < 0,\; u \ge v/t\} \tag{10.46}$$

Figure 10.6: Product of X, Y with uniform joint distribution on the unit square. [Figure: $P(XY \le v)$ = area of shaded region for $0 \le v \le 1$.]



Example 10.25 

$\{X, Y\} \sim$ uniform on unit square

$f_{XY}(t, u) = 1$, $0 \le t \le 1$, $0 \le u \le 1$. Then (see Figure 10.6)

$$P(XY \le v) = \iint_{Q_v} 1\, du\, dt \quad \text{where} \quad Q_v = \{(t, u) : 0 \le t \le 1,\; 0 \le u \le \min\{1, v/t\}\} \tag{10.47}$$

Integration shows

$$F_Z(v) = P(XY \le v) = v(1 - \ln(v)) \quad \text{so that} \quad f_Z(v) = -\ln(v) = \ln(1/v), \quad 0 < v \le 1 \tag{10.48}$$

For $v = 0.5$, $F_Z(0.5) = 0.8466$.

% Note that although f = 1, it must be expressed in terms of t, u.
tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  (u>=0)&(t>=0)
Use array operations on X, Y, PX, PY, t, u, and P
G = t.*u;
[Z,PZ] = csort(G,P);
p = (Z<=0.5)*PZ'
p = 0.8465                              % Theoretical value 0.8466, above
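The integration step in Example 10.25 can be written out as follows. For $0 < v \le 1$, the bound $\min\{1, v/t\}$ equals 1 when $t \le v$ and equals $v/t$ when $t > v$, so the integral splits at $t = v$:

```latex
\begin{aligned}
F_Z(v) &= \int_0^v \int_0^1 du\, dt + \int_v^1 \int_0^{v/t} du\, dt \\
       &= v + v \int_v^1 \frac{dt}{t} \\
       &= v - v \ln v = v\,(1 - \ln v), \\
f_Z(v) &= F_Z'(v) = (1 - \ln v) - 1 = -\ln v .
\end{aligned}
```

At $v = 0.5$ this gives $0.5\,(1 + \ln 2) \approx 0.8466$, in agreement with the value quoted in the example.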
Example 10.26: Continuation of Example 5 (Example 8.5: Marginals for a discrete distribution) from "Random Vectors and Joint Distributions"

The pair $\{X, Y\}$ has joint density $f_{XY}(t, u) = \frac{6}{37}(t + 2u)$ on the region bounded by $t = 0$, $t = 2$, $u = 0$, and $u = \max\{1, t\}$ (see Figure 10.7). Let $Z = XY$. Determine $P(Z \le 1)$.

Figure 10.7: Area of integration for Example 10.26 (Continuation of Example 5 (Example 8.5: Marginals for a discrete distribution) from "Random Vectors and Joint Distributions"). [Figure: region with density $f_{XY}(t, u) = (6/37)(t + 2u)$.]

ANALYTIC SOLUTION

$$P(Z \le 1) = P((X, Y) \in Q) \quad \text{where} \quad Q = \{(t, u) : u \le 1/t\} \tag{10.49}$$

Reference to Figure 10.7 shows that

$$P((X, Y) \in Q) = \frac{6}{37} \int_0^1 \int_0^1 (t + 2u)\, du\, dt + \frac{6}{37} \int_1^2 \int_0^{1/t} (t + 2u)\, du\, dt = 9/37 + 9/37 = 18/37 \approx 0.4865 \tag{10.50}$$

APPROXIMATE SOLUTION

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  300
Enter number of Y approximation points  300
Enter expression for joint density  (6/37)*(t + 2*u).*(u<=max(t,1))
Use array operations on X, Y, PX, PY, t, u, and P
Q = t.*u<=1;
PQ = total(Q.*P)
PQ = 0.4853                             % Theoretical value 0.4865, above
G = t.*u;                               % Alternate, using the distribution for Z
[Z,PZ] = csort(G,P);
PZ1 = (Z<=1)*PZ'
PZ1 = 0.4853



In the following example, the function $g$ has a compound definition. That is, it has a different rule for different parts of the plane.

Figure 10.8: Regions for $P(Z \le 1/2)$ in Example 10.27 (A compound function). [Figure: unit square with the line $u = 1/2$ marked.]



Example 10.27: A compound function 

The pair $\{X, Y\}$ has joint density $f_{XY}(t, u) = \frac{2}{3}(t + 2u)$ on the unit square $0 \le t \le 1$, $0 \le u \le 1$.

$$Z = \begin{cases} Y & \text{for } X^2 - Y \ge 0 \\ X + Y & \text{for } X^2 - Y < 0 \end{cases} = I_Q(X, Y)\, Y + I_{Q^c}(X, Y)\, (X + Y) \tag{10.51}$$

for $Q = \{(t, u) : u \le t^2\}$. Determine $P(Z \le 0.5)$.

ANALYTICAL SOLUTION

$$P(Z \le 1/2) = P(Y \le 1/2,\; Y \le X^2) + P(X + Y \le 1/2,\; Y > X^2) = P((X, Y) \in Q_A \cup Q_B) \tag{10.52}$$

where $Q_A = \{(t, u) : u \le 1/2,\; u \le t^2\}$ and $Q_B = \{(t, u) : t + u \le 1/2,\; u > t^2\}$. Reference to Figure 10.8 shows that this is the part of the unit square for which $u \le \min\left(\max\left(1/2 - t,\; t^2\right),\; 1/2\right)$. We may break up the integral into three parts. Let $1/2 - t_1 = t_1^2$ and $t_2^2 = 1/2$. Then

$$P(Z \le 1/2) = \frac{2}{3} \int_0^{t_1} \int_0^{1/2 - t} (t + 2u)\, du\, dt + \frac{2}{3} \int_{t_1}^{t_2} \int_0^{t^2} (t + 2u)\, du\, dt + \frac{2}{3} \int_{t_2}^{1} \int_0^{1/2} (t + 2u)\, du\, dt = 0.2322 \tag{10.53}$$

APPROXIMATE SOLUTION

tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  (2/3)*(t + 2*u)
Use array operations on X, Y, PX, PY, t, u, and P
Q = u <= t.^2;
G = u.*Q + (t + u).*(1-Q);
prob = total((G<=1/2).*P)
prob = 0.2328                           % Theoretical is 0.2322, above

The setup of the integrals involves careful attention to the geometry of the system. Once set up, the 
evaluation is elementary but tedious. On the other hand, the approximation proceeds in a straightforward 
manner from the normal description of the problem. The numerical result compares quite closely with the 
theoretical value and accuracy could be improved by taking more subdivision points. 
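The approximate solution can be imitated without tuappr. The sketch below assumes a grid of subinterval midpoints with mass proportional to the density; the actual construction inside tuappr may differ, so the numerical value need not match 0.2328 exactly.

```matlab
% Grid approximation to P(Z <= 1/2) for the compound function of Example 10.27.
% Assumption: n-by-n grid of midpoints on the unit square, mass ~ density.
n = 200;
x = ((1:n) - 0.5)/n;                    % midpoints of n subintervals of [0, 1]
[t, u] = meshgrid(x, x);                % t varies along rows, u down columns
P = (2/3)*(t + 2*u);                    % joint density at the grid points
P = P/sum(P(:));                        % normalize to total probability one

Q = (u <= t.^2);                        % indicator of the region Q
G = u.*Q + (t + u).*(1 - Q);            % compound rule: Y on Q, X + Y off Q
prob = sum(P(G <= 1/2));                % approximate P(Z <= 1/2)
disp(prob)
```

As the text notes, the agreement with the theoretical value 0.2322 improves as n is increased.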

10.3 The Quantile Function 3 
10.3.1 The Quantile Function 

The quantile function for a probability distribution has many uses in both the theory and application of probability. If $F$ is a probability distribution function, the quantile function may be used to "construct" a random variable having $F$ as its distribution function. This fact serves as the basis of a method of simulating the "sampling" from an arbitrary distribution with the aid of a random number generator. Also, given any finite class $\{X_i : 1 \le i \le n\}$ of random variables, an independent class $\{Y_i : 1 \le i \le n\}$ may be constructed, with each $X_i$ and associated $Y_i$ having the same (marginal) distribution. Quantile functions for simple random variables may be used to obtain an important Poisson approximation theorem (which we do not develop in this work). The quantile function is used to derive a number of useful special forms for mathematical expectation.

General concept — properties, and examples

If $F$ is a probability distribution function, the associated quantile function $Q$ is essentially an inverse of $F$. The quantile function is defined on the unit interval $(0, 1)$. For $F$ continuous and strictly increasing at $t$, $Q(u) = t$ iff $F(t) = u$. Thus, if $u$ is a probability value, $t = Q(u)$ is the value of $t$ for which $P(X \le t) = u$.

Example 10.28: The Weibull distribution (3, 2, 0) 



$$u = F(t) = 1 - e^{-3t^2}, \quad t \ge 0 \quad \Rightarrow \quad t = Q(u) = \sqrt{-\ln(1 - u)/3} \tag{10.54}$$

Example 10.29: The Normal Distribution

The m-function norminv, based on the MATLAB function erfinv (inverse error function), calculates values of $Q$ for the normal distribution.

The restriction to the continuous case is not essential. We consider a general definition which applies to any probability distribution function.

Definition: If $F$ is a function having the properties of a probability distribution function, then the quantile function for $F$ is given by

$$Q(u) = \inf\{t : F(t) \ge u\} \quad \forall\, u \in (0, 1) \tag{10.55}$$

3 This content is available online at <http://cnx.org/content/m23385/1.6/>.

We note

If $F(t^*) \ge u^*$, then $t^* \ge \inf\{t : F(t) \ge u^*\} = Q(u^*)$
If $F(t^*) < u^*$, then $t^* < \inf\{t : F(t) \ge u^*\} = Q(u^*)$

Hence, we have the important property:

(Q1) $Q(u) \le t$ iff $u \le F(t)$ $\forall\, u \in (0, 1)$.

The property (Q1) implies the following important property:

(Q2) If $U \sim$ uniform $(0, 1)$, then $X = Q(U)$ has distribution function $F_X = F$. To see this, note that $F_X(t) = P[Q(U) \le t] = P[U \le F(t)] = F(t)$.

Property (Q2) implies that if F is any distribution function, with quantile function Q, then the random 
variable X = Q (U), with U uniformly distributed on (0, 1), has distribution function F. 
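Property (Q2) is the basis for simulation by inversion. The following sketch applies it to the Weibull (3, 2, 0) distribution of Example 10.28; the sample size and the comparison point t = 0.5 are arbitrary choices made for illustration, and the empirical value will vary from run to run.

```matlab
% Simulation of X = Q(U) for the Weibull (3, 2, 0) quantile function (10.54).
n = 1e5;                                % assumed sample size
U = rand(1, n);                         % U ~ uniform (0, 1)
Q = @(u) sqrt(-log(1 - u)/3);           % quantile function from Example 10.28
X = Q(U);                               % sample with distribution function F

t = 0.5;                                % point at which to compare
F_emp = mean(X <= t);                   % empirical P(X <= t)
F_thy = 1 - exp(-3*t^2);                % F(t) = 1 - exp(-3 t^2)
disp([F_emp F_thy])
```

With a sample of this size the empirical and theoretical values typically agree to about two decimal places.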

Example 10.30: Independent classes with prescribed distributions 

Suppose $\{X_i : 1 \le i \le n\}$ is an arbitrary class of random variables with corresponding distribution functions $\{F_i : 1 \le i \le n\}$. Let $\{Q_i : 1 \le i \le n\}$ be the respective quantile functions. There is always an independent class $\{U_i : 1 \le i \le n\}$ iid uniform $(0, 1)$ (marginals for the joint uniform distribution on the unit hypercube with sides $(0, 1)$). Then the random variables $Y_i = Q_i(U_i)$, $1 \le i \le n$, form an independent class with the same marginals as the $X_i$.

Several other important properties of the quantile function may be established.

Figure 10.9: Graph of quantile function from graph of distribution function. [Figure: panels (a), (b), (c); panel (a) shows $u = F(t)$.]



1. Q is left-continuous, whereas F is right-continuous.

2. If jumps are represented by vertical line segments, construction of the graph of u = Q(t) may be
obtained by the following two step procedure:

• Invert the entire figure (including axes), then

• Rotate the resulting figure 90 degrees counterclockwise

This is illustrated in Figure 10.9. If jumps are represented by vertical line segments, then jumps go
into flat segments and flat segments go into vertical segments.

3. If X is discrete with probability $p_i$ at $t_i$, $1 \le i \le n$, then F has jumps in the amount $p_i$ at each $t_i$ and
is constant between. The quantile function is a left-continuous step function having value $t_i$ on the
interval $(b_{i-1}, b_i]$, where $b_0 = 0$ and $b_i = \sum_{j=1}^{i} p_j$. This may be stated

If $F(t_i) = b_i$, then $Q(u) = t_i$ for $F(t_{i-1}) < u \le F(t_i)$ (10.56)
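
The step-function rule (10.56) can be sketched in a few lines of Python. This is an illustration only, not the m-function dquant of the text; the name `discrete_quantile` and the three-point distribution are hypothetical, chosen so the partial sums are exact in binary arithmetic.

```python
from bisect import bisect_left
from itertools import accumulate

def discrete_quantile(values, probs):
    """Return Q for a simple distribution; values must be in increasing order."""
    cum = list(accumulate(probs))            # partial sums b_1, ..., b_n
    def Q(u):
        if not 0.0 < u < 1.0:
            raise ValueError("u must lie in (0, 1)")
        # smallest index i with b_i >= u, so that b_{i-1} < u <= b_i
        i = bisect_left(cum, u)
        return values[min(i, len(values) - 1)]   # guard against round-off in b_n
    return Q

# Hypothetical distribution: values 1, 2, 4 with probabilities 1/4, 1/2, 1/4
Q = discrete_quantile([1, 2, 4], [0.25, 0.5, 0.25])
print([Q(u) for u in (0.1, 0.25, 0.3, 0.75, 0.9)])   # [1, 1, 2, 2, 4]
```

Note that Q(0.25) = 1 and Q(0.75) = 2: at a partial sum $b_i$ the function still takes the value $t_i$, which is the left-continuity stated in item 1.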

Example 10.31: Quantile function for a simple random variable 

Suppose simple random variable X has distribution

X = [-2 0 1 3] PX = [0.2 0.1 0.3 0.4] (10.57)

Figure 10.10 shows a plot of the distribution function $F_X$. It is reflected in the horizontal axis, then
rotated counterclockwise to give the graph of Q(u) versus u.

Figure 10.10: Distribution and quantile functions for Example 10.31 (Quantile function for a simple
random variable).



We use the analytic characterization above in developing a number of m-functions and m-procedures.

m-procedures for a simple random variable

The basis for quantile function calculations for a simple random variable is the formula above. This is 
implemented in the m-function dquant, which is used as an element of several simulation procedures. To plot 
the quantile function, we use dquanplot which employs the stairs function and plots X vs the distribution 
function FX. The procedure dsample employs dquant to obtain a "sample" from a population with simple 
distribution and to calculate relative frequencies of the various values. 
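
As a rough illustration of what such a sampling procedure does, the following Python sketch draws values Q(U) from a simple distribution and tabulates relative frequencies. It is not the m-procedure dsample; the function name `simple_sample` and the seed are assumptions, and the distribution is the one used in Example 10.32. Because the random number generator differs from MATLAB's, the relative frequencies will not reproduce those of the example.

```python
import random
from bisect import bisect_left
from collections import Counter
from itertools import accumulate

def simple_sample(values, probs, n, seed=0):
    """Draw n values Q(U) from a simple distribution (values in increasing order)."""
    cum = list(accumulate(probs))            # partial sums of the probabilities
    rng = random.Random(seed)
    last = len(values) - 1
    return [values[min(bisect_left(cum, rng.random()), last)] for _ in range(n)]

X = [-2.3, -1.1, 3.3, 5.4, 7.1, 9.8]
PX = [0.18, 0.15, 0.23, 0.19, 0.13, 0.12]
n = 10000
draws = simple_sample(X, PX, n)
counts = Counter(draws)
rel_freq = [counts[x] / n for x in X]        # relative frequency of each value
sample_mean = sum(draws) / n
pop_mean = sum(x * p for x, p in zip(X, PX))
for x, p, f in zip(X, PX, rel_freq):
    print(f"{x:8.4f} {p:8.4f} {f:8.4f}")
print("sample mean", round(sample_mean, 3), "population mean", round(pop_mean, 3))
```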

Example 10.32: Simple Random Variable 



X = [-2.3 -1.1 3.3 5.4 7.1 9.8];
PX = 0.01*[18 15 23 19 13 12];
dquanplot
Enter VALUES for X  X
Enter PROBABILITIES for X  PX          % See Figure 10.11 for plot of results
rand('seed',0)                         % Reset random number generator for reference
dsample
Enter row matrix of values  X
Enter row matrix of probabilities  PX
Sample size n  10000
    Value      Prob    Rel freq
   -2.3000    0.1800    0.1805
   -1.1000    0.1500    0.1466
    3.3000    0.2300    0.2320
    5.4000    0.1900    0.1875
    7.1000    0.1300    0.1333
    9.8000    0.1200    0.1201
Sample average ex = 3.325
Population mean E[X] = 3.305
Sample variance = 16.32
Population variance Var[X] = 16.33

Figure 10.11: Quantile function for Example 10.32 (Simple Random Variable). (Plot of Quantile Function, with u on the horizontal axis.)



Sometimes it is desirable to know how many trials are required to reach a certain value, or one of a set of
values. A pair of m-procedures is available for simulation of that problem. The first is called targetset. It
calls for the population distribution and then for the designation of a "target set" of possible values. The
second procedure, targetrun, calls for the number of repetitions of the experiment, and asks for the number
of members of the target set to be reached. After the runs are made, various statistics on the runs are
calculated and displayed.
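
The kind of simulation described here can be sketched in Python as follows. This is an illustrative stand-in, not the targetset and targetrun procedures themselves; the function name `completion_time` and the seed are assumptions, and the data are those of Example 10.33. The random number stream differs from MATLAB's, so the statistics will be close to, but not identical with, those reported in the example.

```python
import random
import statistics

def completion_time(values, probs, targets, k, rng):
    """Number of trials until k distinct members of `targets` have appeared."""
    seen = set()
    trials = 0
    while len(seen) < k:
        trials += 1
        x = rng.choices(values, weights=probs)[0]   # one draw from the population
        if x in targets:
            seen.add(x)
    return trials

X = [-1.3, 0.2, 3.7, 5.5, 7.3]      # population values
PX = [0.2, 0.1, 0.3, 0.3, 0.1]      # population probabilities
E = {-1.3, 3.7}                     # target set

rng = random.Random(0)
times = [completion_time(X, PX, E, 2, rng) for _ in range(1000)]
print("average", statistics.mean(times))
print("std dev", round(statistics.stdev(times), 3))
print("min", min(times), "max", max(times))
```

For this target set the expected completion time can be computed directly as 1/0.5 + (0.2/0.5)(1/0.3) + (0.3/0.5)(1/0.2) = 19/3, or about 6.33, which gives a check on the simulated average.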

Example 10.33 

X = [-1.3 0.2 3.7 5.5 7.3];     % Population values
PX = [0.2 0.1 0.3 0.3 0.1];     % Population probabilities
E = [-1.3 3.7];                 % Set of target states
targetset
Enter population VALUES  X
Enter population PROBABILITIES  PX
The set of population values is
   -1.3000    0.2000    3.7000    5.5000    7.3000
Enter the set of target values  E
Call for targetrun
rand('seed',0)                  % Seed set for possible comparison
targetrun
Enter the number of repetitions  1000
The target set is
   -1.3000    3.7000
Enter the number of target values to visit  2
The average completion time is 6.32
The standard deviation is 4.089
The minimum completion time is 2
The maximum completion time is 30
To view a detailed count, call for D.
The first column shows the various completion times;
the second column shows the numbers of trials yielding those times
% Figure 10.12 shows the fraction of runs requiring t steps or less

Figure 10.12: Fraction of runs requiring t steps or less. (Horizontal axis: t = number of steps to complete run.)



m-procedures for distribution functions 

A procedure dfsetup utilizes the distribution function to set up an approximate simple distribution. 
The m-procedure quanplot is used to plot the quantile function. This procedure is essentially the same as 
dquanplot, except the ordinary plot function is used in the continuous case whereas the plotting function 
stairs is used in the discrete case. The m-procedure qsample is used to obtain a sample from the population. 
Since there are so many possible values, these are not displayed as in the discrete case. 
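
One plausible way to build an approximating simple distribution from a distribution function is to assign to each small cell of the range the increment of F over that cell. The Python sketch below does this; it is an assumption about the general approach, not a transcription of dfsetup, and the name `df_setup` is hypothetical. The distribution function is the mixed one used in Example 10.34 (uniform density 0.4 on [-1, 1] plus a jump of 0.2 at t = 0).

```python
def df_setup(F, a, b, n):
    """Approximate a distribution on [a, b] by n point masses taken from F."""
    dx = (b - a) / n
    edges = [a + i * dx for i in range(n + 1)]
    X = [0.5 * (lo + hi) for lo, hi in zip(edges, edges[1:])]   # cell midpoints
    # Value of F just to the left of a is taken as 0 (assumes no mass below a)
    cdf = [0.0] + [F(t) for t in edges[1:]]
    PX = [hi - lo for lo, hi in zip(cdf, cdf[1:])]              # increment of F per cell
    return X, PX

def F(t):
    """Distribution function of Example 10.34."""
    return 0.4 * (t + 1) if t < 0 else 0.6 + 0.4 * t

X, PX = df_setup(F, -1.0, 1.0, 1000)
mean = sum(x * p for x, p in zip(X, PX))
var = sum(x * x * p for x, p in zip(X, PX)) - mean ** 2
print(round(sum(PX), 6), round(mean, 4), round(var, 4))
```

By symmetry the exact mean is 0, and the exact variance is 0.4 · (2/3) ≈ 0.2667; the approximation should come close to both.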

Example 10.34: Quantile function associated with a distribution function. 



F = '0.4*(t + 1).*(t < 0) + (0.6 + 0.4*t).*(t >= 0)';   % String
dfsetup
Distribution function F is entered as a string
variable, either defined previously or upon call
Enter matrix [a b] of X-range endpoints  [-1 1]
Enter number of X approximation points  1000
Enter distribution function F as function of t  F
Distribution is in row matrices X and PX
quanplot
Enter row matrix of values  X
Enter row matrix of probabilities  PX
Probability increment h  0.01                        % See Figure 10.13 for plot
qsample
Enter row matrix of X values  X
Enter row matrix of X probabilities  PX
Sample size n  1000
Sample average ex = -0.004146
Approximate population mean E(X) = -0.0004002        % Theoretical = 0
Sample variance vx = 0.25
Approximate population variance V(X) = 0.2664

Figure 10.13: Quantile function for Example 10.34 (Quantile function associated with a distribution
function.). (Plot of Quantile Function, with u on the horizontal axis.)



m-procedures for density functions 

An m-procedure acsetup is used to obtain the simple approximate distribution. This is essentially the
same as the procedure tuappr, except that the density function is entered as a string variable. Then the 
procedures quanplot and qsample are used as in the case of distribution functions. 
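
A discrete approximation to a density can be sketched in Python by placing at the midpoint of each small cell a mass equal to the density times the cell width, then renormalizing. This is an assumption about the general technique, not a transcription of acsetup or tuappr; the name `density_setup` is hypothetical. The density is the one used in Example 10.35.

```python
def density_setup(f, a, b, n):
    """Approximate a density on [a, b] by n point masses at cell midpoints."""
    dx = (b - a) / n
    X = [a + (i + 0.5) * dx for i in range(n)]
    w = [f(t) * dx for t in X]          # midpoint-rule mass of each cell
    total = sum(w)
    PX = [p / total for p in w]         # renormalize so the masses sum to one
    return X, PX

def f(t):
    """Density of Example 10.35: t^2 on [0, 1), 1 - t/3 on [1, 3]."""
    return t * t if t < 1 else 1 - t / 3

X, PX = density_setup(f, 0.0, 3.0, 1000)
mean = sum(x * p for x, p in zip(X, PX))
var = sum(x * x * p for x, p in zip(X, PX)) - mean ** 2
print(round(mean, 4), round(var, 4))   # compare with 49/36 and 0.3474
```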

Example 10.35: Quantile function associated with a density function. 



acsetup
Density f is entered as a string variable,
either defined previously or upon call.
Enter matrix [a b] of x-range endpoints  [0 3]
Enter number of x approximation points  1000
Enter density as a function of t  '(t.^2).*(t<1) + (1 - t/3).*(1<=t)'
Distribution is in row matrices X and PX
quanplot
Enter row matrix of values  X
Enter row matrix of probabilities  PX
Probability increment h  0.01                   % See Figure 10.14 for plot
rand('seed',0)
qsample
Enter row matrix of values  X
Enter row matrix of probabilities  PX
Sample size n  1000
Sample average ex = 1.352
Approximate population mean E(X) = 1.361        % Theoretical = 49/36 = 1.3611
Sample variance vx = 0.3242
Approximate population variance V(X) = 0.3474   % Theoretical = 0.3474

Figure 10.14: Quantile function for Example 10.35 (Quantile function associated with a density
function.). (Plot of Quantile Function, with u on the horizontal axis.)



10.4 Problems on Functions of Random Variables 4 

Exercise 10.1 (Solution on p. 294.) 

Suppose X is a nonnegative, absolutely continuous random variable. Let $Z = g(X) = Ce^{-aX}$,
where a > 0, C > 0. Then $0 < Z \le C$. Use properties of the exponential and natural log function
to show that

$F_Z(v) = 1 - F_X\left(-\frac{\ln(v/C)}{a}\right)$ for $0 < v \le C$ (10.58)



4 This content is available online at <http://cnx.org/content/m24315/1.4/>.

Exercise 10.2 (Solution on p. 294.)

Use the result of Exercise 10.1 to show that if X ~ exponential ($\lambda$), then

$F_Z(v) = \left(\frac{v}{C}\right)^{\lambda/a}$, $0 < v \le C$ (10.59)

Exercise 10.3 (Solution on p. 294.) 

Present value of future costs. Suppose money may be invested at an annual rate a, compounded
continually. Then one dollar in hand now has a value $e^{ax}$ at the end of x years. Hence, one dollar
spent x years in the future has a present value $e^{-ax}$. Suppose a device put into operation has time
to failure (in years) X ~ exponential ($\lambda$). If the cost of replacement at failure is C dollars, then
the present value of the replacement is $Z = Ce^{-aX}$. Suppose $\lambda = 1/10$, a = 0.07, and C = $1000.

a. Use the result of Exercise 10.2 to determine the probability $Z \le 700, 500, 200$.

b. Use a discrete approximation for the exponential density to approximate the probabilities in 
part (a). Truncate X at 1000 and use 10,000 approximation points. 

Exercise 10.4 (Solution on p. 294.) 

Optimal stocking of merchandise. A merchant is planning for the Christmas season. He intends 
to stock m units of a certain item at a cost of c per unit. Experience indicates demand can be 
represented by a random variable D ~ Poisson ($\mu$). If units remain in stock at the end of the
season, they may be returned with recovery of r per unit. If demand exceeds the number originally
ordered, extra units may be ordered at a cost of s each. Units are sold at a price p per unit. If
Z = g(D) is the gain from the sales, then

• For $t \le m$, $g(t) = (p - c)t - (c - r)(m - t) = (p - r)t + (r - c)m$

• For $t > m$, $g(t) = (p - c)m + (t - m)(p - s) = (p - s)t + (s - c)m$

Let $M = (-\infty, m]$. Then

$g(t) = I_M(t)\,[(p - r)t + (r - c)m] + I_{M^c}(t)\,[(p - s)t + (s - c)m]$ (10.60)

$= (p - s)t + (s - c)m + I_M(t)(s - r)(t - m)$ (10.61)

Suppose $\mu = 50$, m = 50, c = 30, p = 50, r = 20, s = 40.
Approximate the Poisson random variable D by truncating at 100. Determine $P(500 \le Z \le 1100)$.

Exercise 10.5 (Solution on p. 295.) 

(See Example 2 (Example 10.2: Price breaks) from "Functions of a Random Variable") The 
cultural committee of a student organization has arranged a special deal for tickets to a concert. 
The agreement is that the organization will purchase ten tickets at $20 each (regardless of the 
number of individual buyers). Additional tickets are available according to the following schedule: 

• 11-20, $18 each 

• 21-30, $16 each 

• 31-50, $15 each 

• 51-100, $13 each 

If the number of purchasers is a random variable X, the total cost (in dollars) is a random quantity 
Z = g (X) described by 

$g(X) = 200 + 18 I_{M1}(X)(X - 10) + (16 - 18) I_{M2}(X)(X - 20) +$ (10.62)

$(15 - 16) I_{M3}(X)(X - 30) + (13 - 15) I_{M4}(X)(X - 50)$ (10.63)

where $M1 = [10, \infty)$, $M2 = [20, \infty)$, $M3 = [30, \infty)$, $M4 = [50, \infty)$ (10.64)

Suppose X ~ Poisson (75). Approximate the Poisson distribution by truncating at 150. Determine
$P(Z \ge 1000)$, $P(Z \ge 1300)$, and $P(900 \le Z \le 1400)$.

Exercise 10.6 (Solution on p. 295.) 

(See Exercise 6 (Exercise 8.6) from "Problems on Random Vectors and Joint Distributions", and
Exercise 1 (Exercise 9.1) from "Problems on Independent Classes of Random Variables") The pair
{X, Y} has the joint distribution

(in m-file npr08_06.m (Section 17.8.37: npr08_06)):

X = [-2.3 -0.7 1.1 3.9 5.1] Y = [1.3 2.5 4.1 5.3] (10.65)

P = [0.0483 0.0357 0.0420 0.0399 0.0441
     0.0437 0.0323 0.0380 0.0361 0.0399
     0.0713 0.0527 0.0620 0.0609 0.0551
     0.0667 0.0493 0.0580 0.0651 0.0589] (10.66)

Determine $P(\max\{X, Y\} \le 4)$, $P(|X - Y| > 3)$. Let $Z = 3X^3 + 3X^2 Y - Y^3$.
Determine $P(Z < 0)$ and $P(-5 < Z \le 300)$.

Exercise 10.7 (Solution on p. 295.) 

(See Exercise 2 (Exercise 9.2) from "Problems on Independent Classes of Random Variables") The
pair {X, Y} has the joint distribution (in m-file npr09_02.m (Section 17.8.41: npr09_02)):

X = [-3.9 -1.7 1.5 2.8 4.1] Y = [-2 1 2.6 5.1] (10.67)

P = [0.0589 0.0342 0.0304 0.0456 0.0209
     0.0961 0.0556 0.0498 0.0744 0.0341
     0.0682 0.0398 0.0350 0.0528 0.0242
     0.0868 0.0504 0.0448 0.0672 0.0308] (10.68)

Determine $P(\{X + Y \ge 5\} \cup \{Y \le 2\})$, $P(X^2 + Y^2 \le 10)$.

Exercise 10.8 (Solution on p. 296.) 

(See Exercise 7 (Exercise 8.7) from "Problems on Random Vectors and Joint Distributions", and 
Exercise 3 (Exercise 9.3) from "Problems on Independent Classes of Random Variables") The pair 
{X, Y} has the joint distribution 

(in m-file npr08_07.m (Section 17.8.38: npr08_07)): 



P(X = t, Y = u) (10.69)

  t =       -3.1     -0.5      1.2      2.4      3.7      4.9
u = 7.5    0.0090   0.0396   0.0594   0.0216   0.0440   0.0203
    4.1    0.0495   0        0.1089   0.0528   0.0363   0.0231
   -2.0    0.0405   0.1320   0.0891   0.0324   0.0297   0.0189
   -3.8    0.0510   0.0484   0.0726   0.0132   0        0.0077

Table 10.2

Determine $P(X^2 - 3X \le 0)$, $P(X^3 - 3|Y| < 3)$.

Exercise 10.9 (Solution on p. 296.)

For the pair {X, Y} in Exercise 10.8, let $Z = g(X, Y) = 3X^2 + 2XY - Y^2$. Determine and plot
the distribution function for Z.

Exercise 10.10 (Solution on p. 296.)

For the pair {X, Y} in Exercise 10.8, let

W = g(X, Y) = X for $X + Y \le 4$, and 2Y for $X + Y > 4$; that is,
$W = I_M(X, Y) X + I_{M^c}(X, Y)\, 2Y$, where $M = \{(t, u) : t + u \le 4\}$ (10.70)

Determine and plot the distribution function for W.
For the distributions in Exercises 10.11-10.15 below

a. Determine analytically the indicated probabilities.

b. Use a discrete approximation to calculate the same probabilities.

Exercise 10.11 (Solution on p. 296.)

$f_{XY}(t, u) = \frac{3}{88}(2t + 3u^2)$ for $0 \le t \le 2$, $0 \le u \le 1 + t$ (see Exercise 15 (Exercise 8.15) from
"Problems on Random Vectors and Joint Distributions").

$Z = I_{[0,1]}(X)\, 4X + I_{(1,2]}(X)(X + Y)$ (10.71)

Determine $P(Z \le 2)$.

Figure 10.15 (panels: Problem P10-11, Problem P10-12)



Exercise 10.12 (Solution on p. 297.)

$f_{XY}(t, u) = \frac{24}{11} t u$ for $0 \le t \le 2$, $0 \le u \le \min\{1, 2 - t\}$ (see Exercise 17 (Exercise 8.17) from
"Problems on Random Vectors and Joint Distributions").

$Z = I_M(X, Y)\, \frac{1}{2} X + I_{M^c}(X, Y)\, Y^2$, $M = \{(t, u) : u > t\}$ (10.72)

Determine $P(Z \le 1/4)$.
Exercise 10.13 (Solution on p. 297.)

$f_{XY}(t, u) = \frac{3}{23}(t + 2u)$ for $0 \le t \le 2$, $0 \le u \le \max\{2 - t, t\}$ (see Exercise 18 (Exercise 8.18) from
"Problems on Random Vectors and Joint Distributions").

$Z = I_M(X, Y)(X + Y) + I_{M^c}(X, Y)\, 2Y$, $M = \{(t, u) : \max(t, u) \le 1\}$ (10.73)

Determine $P(Z \le 1)$.

Figure 10.16 (panels: Problem P10-13, Problem P10-14)



Exercise 10.14 (Solution on p. 298.)

$f_{XY}(t, u) = \frac{12}{179}(3t^2 + u)$, for $0 \le t \le 2$, $0 \le u \le \min\{2, 3 - t\}$ (see Exercise 19 (Exercise 8.19)
from "Problems on Random Vectors and Joint Distributions").

$Z = I_M(X, Y)(X + Y) + I_{M^c}(X, Y)\, 2Y^2$, $M = \{(t, u) : t \le 1,\ u \ge 1\}$ (10.74)

Determine $P(Z \le 2)$.

Exercise 10.15 (Solution on p. 298.)

$f_{XY}(t, u) = \frac{12}{227}(3t + 2tu)$, for $0 \le t \le 2$, $0 \le u \le \min\{1 + t, 2\}$ (see Exercise 20 (Exercise 8.20)
from "Problems on Random Vectors and Joint Distributions")

$Z = I_M(X, Y)\, X + I_{M^c}(X, Y)\, \frac{Y}{X}$, $M = \{(t, u) : u \le \min(1, 2 - t)\}$ (10.75)

Determine $P(Z \le 1)$.

Figure 10.17 (panel: Problem P10-15)



Exercise 10.16 (Solution on p. 299.)

The class {X, Y, Z} is independent.

$X = -2I_A + I_B + 3I_C$. Minterm probabilities are (in the usual order)

0.255 0.025 0.375 0.045 0.108 0.012 0.162 0.018 (10.76)

$Y = I_D + 3I_E + I_F - 3$. The class {D, E, F} is independent with

P(D) = 0.32 P(E) = 0.56 P(F) = 0.40 (10.77)

Z has distribution

Value         -1.3    1.2    2.7    3.4    5.8
Probability   0.12   0.24   0.43   0.13   0.08

Table 10.3

Determine $P(X^2 + 3XY^2 > 3Z)$.

Exercise 10.17 (Solution on p. 299.) 

The simple random variable X has distribution 

X=[-3.1 -0.5 1.2 2.4 3.7 4.9] PX = [0.15 0.22 0.33 0.12 0.11 0.07] (10.78) 



a. Plot the distribution function $F_X$ and the quantile function $Q_X$.

b. Take a random sample of size n = 10, 000. Compare the relative frequency for each value 
with the probability that value is taken on. 






Solutions to Exercises in Chapter 10 

Solution to Exercise 10.1 (p. 287) 

$Z = Ce^{-aX} \le v$ iff $e^{-aX} \le v/C$ iff $-aX \le \ln(v/C)$ iff $X \ge -\ln(v/C)/a$, so that

$F_Z(v) = P(Z \le v) = P(X \ge -\ln(v/C)/a) = 1 - F_X\left(-\frac{\ln(v/C)}{a}\right)$ (10.79)

Solution to Exercise 10.2 (p. 287)

$F_Z(v) = 1 - \left[1 - \exp\left(\frac{\lambda}{a} \cdot \ln(v/C)\right)\right] = \left(\frac{v}{C}\right)^{\lambda/a}$ (10.80)

Solution to Exercise 10.3 (p. 288)

$P(Z \le v) = \left(\frac{v}{1000}\right)^{10/7}$ (10.81)

v = [700 500 200];
P = (v/1000).^(10/7)
P = 0.6008 0.3715 0.1003

tappr
Enter matrix [a b] of x-range endpoints  [0 1000]
Enter number of x approximation points  10000
Enter density as a function of t  0.1*exp(-t/10)
Use row matrices X and PX as in the simple case
G = 1000*exp(-0.07*t);
PM1 = (G<=700)*PX'
PM1 = 0.6005
PM2 = (G<=500)*PX'
PM2 = 0.3716
PM3 = (G<=200)*PX'
PM3 = 0.1003
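
For readers without MATLAB, the two calculations of this solution can be sketched in Python. The closed form is the one from Exercise 10.2; the discrete approximation uses midpoint masses on [0, 1000], which is an assumption about the general approach and not a transcription of tappr, so the approximate values may differ slightly from those above.

```python
import math

lam, a, C = 0.1, 0.07, 1000.0
levels = [700, 500, 200]

# (a) Closed form from Exercise 10.2: P(Z <= v) = (v/C)**(lam/a)
exact = [(v / C) ** (lam / a) for v in levels]

# (b) Discrete approximation: exponential density on [0, 1000], midpoint masses
n, b = 10000, 1000.0
dx = b / n
T = [(i + 0.5) * dx for i in range(n)]
PX = [lam * math.exp(-lam * t) * dx for t in T]
total = sum(PX)
PX = [p / total for p in PX]                 # renormalize the truncated masses
G = [C * math.exp(-a * t) for t in T]        # values of Z = g(X) on the grid
approx = [sum(p for g, p in zip(G, PX) if g <= v) for v in levels]

print([round(p, 4) for p in exact])          # [0.6008, 0.3715, 0.1003]
print([round(p, 4) for p in approx])
```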

Solution to Exercise 10.4 (p. 288) 



mu = 50 ; 
D = 0:100; 
c = 30; 
p = 50; 
r = 20; 
s = 40; 
m = 50; 

PD = ipoisson(mu,D) ; 

G = (p - s)*D + (s - c)*m +(s - r)*(D - m).*(D <= m) ; 
M = (500<=G)&(G<=1100); 
PM = M*PD' 
PM = 0.9209 
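
The calculation can be cross-checked with a short Python sketch. The Poisson probabilities are computed directly rather than with the m-function ipoisson; the names `poisson_pmf` and `gain` are illustrative only.

```python
import math

mu, m = 50, 50
c, p, r, s = 30, 50, 20, 40

def poisson_pmf(k, mu):
    """P(D = k) for D ~ Poisson(mu), computed in log space for stability."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def gain(d):
    """g(d) from (10.61): (p - s)d + (s - c)m + I_M(d)(s - r)(d - m), M = (-inf, m]."""
    return (p - s) * d + (s - c) * m + (s - r) * (d - m) * (d <= m)

D = range(0, 101)                         # truncate the Poisson range at 100
PD = [poisson_pmf(d, mu) for d in D]
PM = sum(pd for d, pd in zip(D, PD) if 500 <= gain(d) <= 1100)
print(round(PM, 4))                       # compare with the MATLAB value above
```

With these parameters the event $500 \le Z \le 1100$ is the same as $34 \le D \le 60$, since g(33) = 490, g(34) = 520, g(60) = 1100, and g(61) = 1110.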



[Z,PZ] = csort(G,PD);              % Alternate: use dbn for Z
m = (500<=Z)&(Z<=1100);
pm = m*PZ'
pm = 0.9209

Solution to Exercise 10.5 (p. 288)

X = 0:150;
PX = ipoisson(75,X);
G = 200 + 18*(X - 10).*(X>=10) + (16 - 18)*(X - 20).*(X>=20) + ...
    (15 - 16)*(X - 30).*(X>=30) + (13 - 15)*(X - 50).*(X>=50);
P1 = (G>=1000)*PX'
P1 = 0.9288
P2 = (G>=1300)*PX'
P2 = 0.1142
P3 = ((900<=G)&(G<=1400))*PX'
P3 = 0.9742
[Z,PZ] = csort(G,PX);              % Alternate: use dbn for Z
p1 = (Z>=1000)*PZ'
p1 = 0.9288

Solution to Exercise 10.6 (p. 289) 

npr08_06 (Section 17.8.37: npr08_06)
Data are in X, Y, P
jcalc
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
Use array operations on matrices X, Y, PX, PY, t, u, and P
P1 = total((max(t,u)<=4).*P)
P1 = 0.4860
P2 = total((abs(t-u)>3).*P)
P2 = 0.4516
G = 3*t.^3 + 3*t.^2.*u - u.^3;
P3 = total((G<0).*P)
P3 = 0.5420
P4 = total(((-5<G)&(G<=300)).*P)
P4 = 0.3713
[Z,PZ] = csort(G,P);               % Alternate: use dbn for Z
p4 = ((-5<Z)&(Z<=300))*PZ'
p4 = 0.3713

Solution to Exercise 10.7 (p. 289) 

npr09_02 (Section 17.8.41: npr09_02)
Data are in X, Y, P
jcalc
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
Use array operations on matrices X, Y, PX, PY, t, u, and P
M1 = (t+u>=5)|(u<=2);
P1 = total(M1.*P)
P1 = 0.7054
M2 = t.^2 + u.^2 <= 10;
P2 = total(M2.*P)
P2 = 0.3282

Solution to Exercise 10.8 (p. 289) 

npr08_07 (Section 17.8.38: npr08_07)
Data are in X, Y, P
jcalc
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
Use array operations on matrices X, Y, PX, PY, t, u, and P
M1 = t.^2 - 3*t <= 0;
P1 = total(M1.*P)
P1 = 0.4500
M2 = t.^3 - 3*abs(u) < 3;
P2 = total(M2.*P)
P2 = 0.7876

Solution to Exercise 10.9 (p. 290) 

G = 3*t.^2 + 2*t.*u - u.^2;        % Determine g(X,Y)
[Z,PZ] = csort(G,P);               % Obtain dbn for Z = g(X,Y)
ddbn                               % Call for plotting m-procedure

Enter row matrix of VALUES Z 
Enter row matrix of PROBABILITIES PZ % Plot not reproduced here 

Solution to Exercise 10.10 (p. 290) 

H = t.*(t+u<=4) + 2*u.*(t+u>4) ; 
[W,PW] = csort(H,P); 
ddbn 

Enter row matrix of VALUES W 
Enter row matrix of PROBABILITIES PW % Plot not reproduced here 

Solution to Exercise 10.11 (p. 290) 

$P(Z \le 2) = P\left((X, Y) \in Q = Q_1 M_1 \bigvee Q_2 M_2\right)$, where $M_1 = \{(t, u) : 0 \le t \le 1,\ 0 \le u \le 1 + t\}$ (10.82)

$M_2 = \{(t, u) : 1 < t \le 2,\ 0 \le u \le 1 + t\}$ (10.83)

$Q_1 = \{(t, u) : 0 \le t \le 1/2\}$, $Q_2 = \{(t, u) : u \le 2 - t\}$ (see figure) (10.84)

$P = \frac{3}{88} \int_0^{1/2} \int_0^{1+t} (2t + 3u^2)\, du\, dt + \frac{3}{88} \int_1^2 \int_0^{2-t} (2t + 3u^2)\, du\, dt = \frac{563}{5632}$ (10.85)

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 3]
Enter number of X approximation points  200
Enter number of Y approximation points  300
Enter expression for joint density  (3/88)*(2*t + 3*u.^2).*(u<=1+t)
Use array operations on X, Y, PX, PY, t, u, and P
G = 4*t.*(t<=1) + (t+u).*(t>1);
[Z,PZ] = csort(G,P);
PZ2 = (Z<=2)*PZ'
PZ2 = 0.1010                       % Theoretical = 563/5632 = 0.1000

Solution to Exercise 10.12 (p. 291) 

$P(Z \le 1/4) = P\left((X, Y) \in M_1 Q_1 \bigvee M_2 Q_2\right)$, $M_1 = \{(t, u) : 0 \le t \le u \le 1\}$ (10.86)

$M_2 = \{(t, u) : 0 \le t \le 2,\ 0 \le u \le \min(t, 2 - t)\}$ (10.87)

$Q_1 = \{(t, u) : t \le 1/2\}$, $Q_2 = \{(t, u) : u \le 1/2\}$ (see figure) (10.88)

$P = \frac{24}{11} \int_0^{1/2} \int_0^1 tu\, du\, dt + \frac{24}{11} \int_{1/2}^{3/2} \int_0^{1/2} tu\, du\, dt + \frac{24}{11} \int_{3/2}^{2} \int_0^{2-t} tu\, du\, dt = \frac{85}{176}$ (10.89)

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  400
Enter number of Y approximation points  200
Enter expression for joint density  (24/11)*t.*u.*(u<=min(1,2-t))
Use array operations on X, Y, PX, PY, t, u, and P
G = 0.5*t.*(u>t) + u.^2.*(u<t);
[Z,PZ] = csort(G,P);
pp = (Z<=1/4)*PZ'
pp = 0.4844                        % Theoretical = 85/176 = 0.4830

Solution to Exercise 10.13 (p. 291) 

$P(Z \le 1) = P\left((X, Y) \in M_1 Q_1 \bigvee M_2 Q_2\right)$, $M_1 = \{(t, u) : 0 \le t \le 1,\ 0 \le u \le 1 - t\}$ (10.90)

$M_2 = \{(t, u) : 1 \le t \le 2,\ 0 \le u \le t\}$ (10.91)

$Q_1 = \{(t, u) : u \le 1 - t\}$, $Q_2 = \{(t, u) : u \le 1/2\}$ (see figure) (10.92)

$P = \frac{3}{23} \int_0^1 \int_0^{1-t} (t + 2u)\, du\, dt + \frac{3}{23} \int_1^2 \int_0^{1/2} (t + 2u)\, du\, dt = \frac{9}{46}$ (10.93)

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  300
Enter number of Y approximation points  300
Enter expression for joint density  (3/23)*(t + 2*u).*(u<=max(2-t,t))
Use array operations on X, Y, PX, PY, t, u, and P
M = max(t,u) <= 1;
G = M.*(t + u) + (1 - M)*2.*u;
p = total((G<=1).*P)
p = 0.1960                         % Theoretical = 9/46 = 0.1957

Solution to Exercise 10.14 (p. 292) 

$P(Z \le 2) = P\left((X, Y) \in M_1 Q_1 \bigvee (M_2 \bigvee M_3) Q_2\right)$, $M_1 = \{(t, u) : 0 \le t \le 1,\ 1 \le u \le 2\}$ (10.94)

$M_2 = \{(t, u) : 0 \le t \le 1,\ 0 \le u \le 1\}$, $M_3 = \{(t, u) : 1 \le t \le 2,\ 0 \le u \le 3 - t\}$ (10.95)

$Q_1 = \{(t, u) : u \le 2 - t\}$, $Q_2 = \{(t, u) : u \le 1\}$ (see figure) (10.96)

$P = \frac{12}{179} \int_0^1 \int_0^{2-t} (3t^2 + u)\, du\, dt + \frac{12}{179} \int_1^2 \int_0^1 (3t^2 + u)\, du\, dt = \frac{119}{179}$ (10.97)

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  300
Enter number of Y approximation points  300
Enter expression for joint density  (12/179)*(3*t.^2 + u).*(u<=min(2,3-t))
Use array operations on X, Y, PX, PY, t, u, and P
M = (t<=1)&(u>=1);
G = M.*(t + u) + (1 - M)*2.*u.^2;
p = total((G<=2).*P)
p = 0.6662                         % Theoretical = 119/179 = 0.6648

Solution to Exercise 10.15 (p. 292) 

$P(Z \le 1) = P\left((X, Y) \in M_1 Q_1 \bigvee M_2 Q_2\right)$, $M_1 = M$, $M_2 = M^c$ (10.98)

$Q_1 = \{(t, u) : 0 \le t \le 1\}$, $Q_2 = \{(t, u) : u \le t\}$ (see figure) (10.99)

$P = \frac{12}{227} \int_0^1 \int_0^1 (3t + 2tu)\, du\, dt + \frac{12}{227} \int_1^2 \int_{2-t}^{t} (3t + 2tu)\, du\, dt = \frac{124}{227}$ (10.100)

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  400
Enter number of Y approximation points  400
Enter expression for joint density  (12/227)*(3*t+2*t.*u).*(u<=min(1+t,2))
Use array operations on X, Y, PX, PY, t, u, and P
Q = (u<=1).*(t<=1) + (t>1).*(u>=2-t).*(u<=t);
P = total(Q.*P)
P = 0.5478                         % Theoretical = 124/227 = 0.5463

Solution to Exercise 10.16 (p. 293)

% file npr10_16.m (Section 17.8.42: npr10_16)    Data for Exercise 10.16
cx = [-2 1 3 0];
pmx = 0.001*[255 25 375 45 108 12 162 18];
cy = [1 3 1 -3];
pmy = minprob(0.01*[32 56 40]);
Z = [-1.3 1.2 2.7 3.4 5.8];
PZ = 0.01*[12 24 43 13 8];
disp('Data are in cx, pmx, cy, pmy, Z, PZ')

npr10_16                           % Call for data
Data are in cx, pmx, cy, pmy, Z, PZ
[X,PX] = canonicf(cx,pmx);
[Y,PY] = canonicf(cy,pmy);
icalc3
Enter row matrix of X-values  X
Enter row matrix of Y-values  Y
Enter row matrix of Z-values  Z
Enter X probabilities  PX
Enter Y probabilities  PY
Enter Z probabilities  PZ
Use array operations on matrices X, Y, Z,
PX, PY, PZ, t, u, v, and P
M = t.^2 + 3*t.*u.^2 > 3*v;
PM = total(M.*P)
PM = 0.3587

Solution to Exercise 10.17 (p. 293) 



X = [-3.1 -0.5 1.2 2.4 3.7 4.9];
PX = 0.01*[15 22 33 12 11 7];
ddbn
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX      % Plot not reproduced here
dquanplot
Enter VALUES for X  X
Enter PROBABILITIES for X  PX              % Plot not reproduced here
rand('seed',0)                             % Reset random number generator
dsample                                    % for comparison purposes
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX
Sample size n  10000
    Value      Prob    Rel freq
   -3.1000    0.1500    0.1490
   -0.5000    0.2200    0.2164
    1.2000    0.3300    0.3340
    2.4000    0.1200    0.1184
    3.7000    0.1100    0.1070
    4.9000    0.0700    0.0752
Sample average ex = 0.8792
Population mean E[X] = 0.859
Sample variance vx = 5.146
Population variance Var[X] = 5.112



Chapter 11 

Mathematical Expectation 

11.1 Mathematical Expectation: Simple Random Variables 1 

11.1.1 Introduction 

The probability that real random variable X takes a value in a set M of real numbers is interpreted as the
likelihood that the observed value $X(\omega)$ on any trial will lie in M. Historically, this idea of likelihood is rooted
in the intuitive notion that if the experiment is repeated enough times the probability is approximately the
fraction of times the value of X will fall in M. Associated with this interpretation is the notion of the average of
the values taken on. We incorporate the concept of mathematical expectation into the mathematical model
as an appropriate form of such averages. We begin by studying the mathematical expectation of simple
random variables, then extend the definition and properties to the general case. In the process, we note the
relationship of mathematical expectation to the Lebesgue integral, which is developed in abstract measure
theory. Although we do not develop this theory, which lies beyond the scope of this study, identification of
this relationship provides access to a rich and powerful set of properties which have far-reaching consequences
in both application and theory.

11.1.2 Expectation for simple random variables 

The notion of mathematical expectation is closely related to the idea of a weighted mean, used extensively 
in the handling of numerical data. Consider the arithmetic average $\bar{x}$ of the following ten numbers: 1, 2, 2,
2, 4, 5, 5, 8, 8, 8, which is given by

$\bar{x} = \frac{1}{10}(1 + 2 + 2 + 2 + 4 + 5 + 5 + 8 + 8 + 8)$ (11.1)

Examination of the ten numbers to be added shows that five distinct values are included. One of the ten, 
or the fraction 1/10 of them, has the value 1, three of the ten, or the fraction 3/10 of them, have the value 
2, 1/10 has the value 4, 2/10 have the value 5, and 3/10 have the value 8. Thus, we could write 

x = (0.1 • 1 + 0.3 • 2 + 0.1 • 4 + 0.2 • 5 + 0.3 • 8) (11.2) 

The pattern in this last expression can be stated in words: Multiply each possible value by the fraction 
of the numbers having that value and then sum these products. The fractions are often referred to as the 
relative frequencies. A sum of this sort is known as a weighted average. 
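
The equality of the two forms of the average is easy to check numerically. The following short Python sketch (an illustration, using exact rational arithmetic from the standard library) computes the plain arithmetic average of the ten numbers and the relative-frequency weighted average.

```python
from collections import Counter
from fractions import Fraction

numbers = [1, 2, 2, 2, 4, 5, 5, 8, 8, 8]
n = len(numbers)

plain = Fraction(sum(numbers), n)                       # arithmetic average
rel_freq = {t: Fraction(f, n) for t, f in Counter(numbers).items()}
weighted = sum(t * p for t, p in rel_freq.items())      # sum of value * relative frequency

print(plain, weighted)   # both print 9/2
```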

In general, suppose there are n numbers $\{x_1, x_2, \cdots, x_n\}$ to be averaged, with $m \le n$ distinct values
$\{t_1, t_2, \cdots, t_m\}$. Suppose $f_1$ have value $t_1$, $f_2$ have value $t_2$, $\cdots$, $f_m$ have value $t_m$. The $f_i$ must add to n.



1 This content is available online at <http://cnx.org/content/m23387/1.5/>.

If we set $p_i = f_i/n$, then the fraction $p_i$ is called the relative frequency of those numbers in the set which
have the value $t_i$, $1 \le i \le m$. The average $\bar{x}$ of the n numbers may be written

$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \sum_{j=1}^{m} t_j p_j$ (11.3)

In probability theory, we have a similar averaging process in which the relative frequencies of the various
possible values are replaced by the probabilities that those values are observed on any trial.

Definition. For a simple random variable X with values $\{t_1, t_2, \cdots, t_n\}$ and corresponding probabilities
$p_i = P(X = t_i)$, the mathematical expectation, designated E[X], is the probability weighted average of the
values taken on by X. In symbols

$E[X] = \sum_{i=1}^{n} t_i P(X = t_i) = \sum_{i=1}^{n} t_i p_i$ (11.4)

Note that the expectation is determined by the distribution. Two quite different random variables may have 
the same distribution, hence the same expectation. Traditionally, this average has been called the mean, or 
the mean value, of the random variable X. 

Example 11.1: Some special cases 

1. Since $X = aI_E = 0 \cdot I_{E^c} + aI_E$, we have $E[aI_E] = aP(E)$.

2. For X a constant c, $X = cI_\Omega$, so that $E[c] = cP(\Omega) = c$.

3. If $X = \sum_{i=1}^{n} t_i I_{A_i}$ then $aX = \sum_{i=1}^{n} a t_i I_{A_i}$, so that

$E[aX] = \sum_{i=1}^{n} a t_i P(A_i) = a \sum_{i=1}^{n} t_i P(A_i) = aE[X]$ (11.5)

Figure 11.1: Moment of a probability distribution about the origin. (Labels: negative moments, positive moments; E[X] = sum of moments = center of mass.)

Mechanical interpretation

In order to aid in visualizing an essentially abstract system, we have employed the notion of probability
as mass. The distribution induced by a real random variable on the line is visualized as a unit of probability
mass actually distributed along the line. We utilize the mass distribution to give an important and helpful
mechanical interpretation of the expectation or mean value. In Example 6 (Example 11.16: Alternate
interpretation of the mean value) in "Mathematical Expectation: General Random Variables", we give an
alternate interpretation in terms of mean-square estimation.

Suppose the random variable X has values $\{t_i : 1 \le i \le n\}$, with $P(X = t_i) = p_i$. This produces a
probability mass distribution, as shown in Figure 11.1, with point mass concentration in the amount of $p_i$ at
the point $t_i$. The expectation is

$\sum_{i} t_i p_i$ (11.6)

Now $|t_i|$ is the distance of point mass $p_i$ from the origin, with $p_i$ to the left of the origin iff $t_i$ is negative.
Mechanically, the sum of the products $t_i p_i$ is the moment of the probability mass distribution about the
origin on the real line. From physical theory, this moment is known to be the same as the product of the
total mass times the number which locates the center of mass. Since the total mass is one, the mean value
is the location of the center of mass. If the real line is viewed as a stiff, weightless rod with point mass $p_i$
attached at each value $t_i$ of X, then the mean value $\mu_X$ is the point of balance. Often there are symmetries
in the distribution which make it possible to determine the expectation without detailed calculation.

Example 11.2: The number of spots on a die 

Let X be the number of spots which turn up on a throw of a simple six-sided die. We suppose each 
number is equally likely. Thus the values are the integers one through six, and each probability is 
1/6. By definition 

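$$E[X] = \frac{1}{6} \cdot 1 + \frac{1}{6} \cdot 2 + \frac{1}{6} \cdot 3 + \frac{1}{6} \cdot 4 + \frac{1}{6} \cdot 5 + \frac{1}{6} \cdot 6 = \frac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = \frac{7}{2} \tag{11.7}$$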

Although the calculation is very simple in this case, it is really not necessary. The probability 
distribution places equal mass at each of the integer values one through six. The center of mass is 
at the midpoint. 

Example 11.3: A simple choice 

A child is told she may have one of four toys. The prices are $2.50, $3.00, $2.00, and $3.50,
respectively. She chooses one, with respective probabilities 0.2, 0.3, 0.2, and 0.3 of choosing the first,
second, third or fourth. What is the expected cost of her selection?

$$E[X] = 2.00 \cdot 0.2 + 2.50 \cdot 0.2 + 3.00 \cdot 0.3 + 3.50 \cdot 0.3 = 2.85 \tag{11.8}$$

For a simple random variable, the mathematical expectation is determined as the dot product of the value 
matrix with the probability matrix. This is easily calculated using MATLAB. 

Example 11.4: MATLAB calculation for Example 11.3

X = [2 2.5 3 3.5];       % Matrix of values (ordered)
PX = 0.1*[2 2 3 3];      % Matrix of probabilities
EX = dot(X,PX)           % The usual MATLAB operation
EX = 2.8500
Ex = sum(X.*PX)          % An alternate calculation
Ex = 2.8500
ex = X*PX'               % Another alternate
ex = 2.8500
Expectation and primitive form

The definition and treatment above assumes X is in canonical form, in which case 

$$X = \sum_{i=1}^{n} t_i I_{A_i}, \text{ where } A_i = \{X = t_i\}, \text{ implies } E[X] = \sum_{i=1}^{n} t_i P(A_i) \tag{11.9}$$

We wish to ease this restriction to canonical form. 

Suppose simple random variable X is in a primitive form 

$$X = \sum_{j=1}^{m} c_j I_{C_j}, \text{ where } \{C_j : 1 \le j \le m\} \text{ is a partition} \tag{11.10}$$

We show that

$$E[X] = \sum_{j=1}^{m} c_j P(C_j) \tag{11.11}$$

Before a formal verification, we begin with an example which exhibits the essential pattern. Establishing 
the general case is simply a matter of appropriate use of notation. 

Example 11.5: Simple random variable X in primitive form 

$$X = I_{C_1} + 2I_{C_2} + I_{C_3} + 3I_{C_4} + 2I_{C_5} + 2I_{C_6}, \text{ with } \{C_1, C_2, C_3, C_4, C_5, C_6\} \text{ a partition} \tag{11.12}$$

Inspection shows the distinct possible values of X to be 1, 2, or 3. Also,

$$A_1 = \{X = 1\} = C_1 \bigvee C_3, \quad A_2 = \{X = 2\} = C_2 \bigvee C_5 \bigvee C_6, \quad \text{and} \quad A_3 = \{X = 3\} = C_4 \tag{11.13}$$

so that

$$P(A_1) = P(C_1) + P(C_3), \quad P(A_2) = P(C_2) + P(C_5) + P(C_6), \quad \text{and} \quad P(A_3) = P(C_4) \tag{11.14}$$

Now

$$E[X] = P(A_1) + 2P(A_2) + 3P(A_3) = P(C_1) + P(C_3) + 2[P(C_2) + P(C_5) + P(C_6)] + 3P(C_4) \tag{11.15}$$
$$= P(C_1) + 2P(C_2) + P(C_3) + 3P(C_4) + 2P(C_5) + 2P(C_6) \tag{11.16}$$

To establish the general pattern, consider $X = \sum_{j=1}^{m} c_j I_{C_j}$. We identify the distinct set of values contained
in the set $\{c_j : 1 \le j \le m\}$. Suppose these are $t_1 < t_2 < \cdots < t_n$. For any value $t_i$ in the range, identify the
index set $J_i$ of those $j$ such that $c_j = t_i$. Then the terms

$$\sum_{j \in J_i} c_j I_{C_j} = t_i \sum_{j \in J_i} I_{C_j} = t_i I_{A_i}, \text{ where } A_i = \bigvee_{j \in J_i} C_j \tag{11.17}$$

By the additivity of probability

$$P(A_i) = P(X = t_i) = \sum_{j \in J_i} P(C_j) \tag{11.18}$$

Since for each $j \in J_i$ we have $c_j = t_i$, we have

$$E[X] = \sum_{i=1}^{n} t_i P(A_i) = \sum_{i=1}^{n} t_i \sum_{j \in J_i} P(C_j) = \sum_{i=1}^{n} \sum_{j \in J_i} c_j P(C_j) = \sum_{j=1}^{m} c_j P(C_j) \tag{11.19}$$

□

Thus, the defining expression for expectation holds for X in a primitive form.

An alternate approach to obtaining the expectation from a primitive form is to use the csort operation 
to determine the distribution of X from the coefficients and probabilities of the primitive form. 

Example 11.6: Alternate determinations of E [X] 

Suppose X in a primitive form is 

$$X = I_{C_1} + 2I_{C_2} + I_{C_3} + 3I_{C_4} + 2I_{C_5} + 2I_{C_6} + I_{C_7} + 3I_{C_8} + 2I_{C_9} + I_{C_{10}} \tag{11.20}$$

with respective probabilities

$$P(C_i) = 0.08, 0.11, 0.06, 0.13, 0.05, 0.08, 0.12, 0.07, 0.14, 0.16 \tag{11.21}$$

c = [1 2 1 3 2 2 1 3 2 1];               % Matrix of coefficients
pc = 0.01*[8 11 6 13 5 8 12 7 14 16];    % Matrix of probabilities
EX = c*pc'                               % Direct solution
EX = 1.7800
[X,PX] = csort(c,pc);                    % Determination of dbn for X
disp([X;PX]')
    1.0000    0.4200
    2.0000    0.3800
    3.0000    0.2000
Ex = X*PX'                               % E[X] from distribution
Ex = 1.7800
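
The consolidation step performed by csort can also be sketched with base MATLAB functions only. The following is a minimal sketch, assuming csort does no more than sort the distinct values and add the probabilities attached to equal values; the variable name idx is introduced here for illustration and is not part of the text's m-programs.

```matlab
% Data from Example 11.6
c  = [1 2 1 3 2 2 1 3 2 1];              % coefficients of the primitive form
pc = 0.01*[8 11 6 13 5 8 12 7 14 16];    % probabilities P(C_j)

% unique returns the sorted distinct values and, for each c(j),
% the index of its value in that sorted list
[X,~,idx] = unique(c);

% accumarray adds the probabilities that share the same index
PX = accumarray(idx(:), pc(:))';

EX = X*PX';                              % expectation from the distribution
disp([X;PX]')
```

The displayed distribution should match the one obtained with csort above, and EX should agree with the direct computation c*pc'.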

Linearity 

The result on primitive forms may be used to establish the linearity of mathematical expectation for 
simple random variables. Because of its fundamental importance, we work through the verification in some 
detail. 

Suppose $X = \sum_{i=1}^{n} t_i I_{A_i}$ and $Y = \sum_{j=1}^{m} u_j I_{B_j}$ (both in canonical form). Since

$$\sum_{i=1}^{n} I_{A_i} = \sum_{j=1}^{m} I_{B_j} = 1 \tag{11.22}$$

we have

$$X + Y = \sum_{i=1}^{n} t_i I_{A_i} \left( \sum_{j=1}^{m} I_{B_j} \right) + \sum_{j=1}^{m} u_j I_{B_j} \left( \sum_{i=1}^{n} I_{A_i} \right) = \sum_{i=1}^{n} \sum_{j=1}^{m} (t_i + u_j) I_{A_i} I_{B_j} \tag{11.23}$$

Note that $I_{A_i} I_{B_j} = I_{A_i B_j}$ and $A_i B_j = \{X = t_i, Y = u_j\}$. The class of these sets for all possible pairs $(i, j)$
forms a partition. Thus, the last summation expresses $Z = X + Y$ in a primitive form. Because of the result
on primitive forms, above, we have

$$E[X + Y] = \sum_{i=1}^{n} \sum_{j=1}^{m} (t_i + u_j) P(A_i B_j) = \sum_{i=1}^{n} \sum_{j=1}^{m} t_i P(A_i B_j) + \sum_{i=1}^{n} \sum_{j=1}^{m} u_j P(A_i B_j) \tag{11.24}$$

$$= \sum_{i=1}^{n} t_i \sum_{j=1}^{m} P(A_i B_j) + \sum_{j=1}^{m} u_j \sum_{i=1}^{n} P(A_i B_j) \tag{11.25}$$

We note that for each $i$ and for each $j$

$$P(A_i) = \sum_{j=1}^{m} P(A_i B_j) \quad \text{and} \quad P(B_j) = \sum_{i=1}^{n} P(A_i B_j) \tag{11.26}$$

Hence, we may write

$$E[X + Y] = \sum_{i=1}^{n} t_i P(A_i) + \sum_{j=1}^{m} u_j P(B_j) = E[X] + E[Y] \tag{11.27}$$

Now aX and bY are simple if X and Y are, so that with the aid of Example 11.1 (Some special cases) we 
have 

E [aX + bY] = E [aX] + E [bY] = aE [X] + bE [Y] (11.28) 

If X, Y, Z are simple, then so are aX + bY, and cZ. It follows that 

E [aX + bY + cZ] = E [aX + bY] + cE [Z] = aE [X] + bE [Y] + cE [Z] (11.29) 

By an inductive argument, this pattern may be extended to a linear combination of any finite number of 
simple random variables. Thus we may assert 

Linearity. The expectation of a linear combination of a finite number of simple random variables is that 
linear combination of the expectations of the individual random variables. 
— □ 

Expectation of a simple random variable in affine form
As a direct consequence of linearity, whenever simple random variable X is in affine form, then

$$E[X] = E\left[ c_0 + \sum_{i=1}^{m} c_i I_{E_i} \right] = c_0 + \sum_{i=1}^{m} c_i P(E_i) \tag{11.30}$$



Thus, the defining expression holds for any affine combination of indicator functions, whether in canonical 
form or not. 

Example 11.7: Binomial distribution (n, p) 

This random variable appears as the number of successes in n Bernoulli trials with probability p of 
success on each component trial. It is naturally expressed in affine form 

$$X = \sum_{i=1}^{n} I_{E_i} \quad \text{so that} \quad E[X] = \sum_{i=1}^{n} p = np \tag{11.31}$$

Alternately, in canonical form

$$X = \sum_{k=0}^{n} k I_{A_{kn}}, \text{ with } p_k = P(A_{kn}) = P(X = k) = C(n, k) p^k q^{n-k}, \quad q = 1 - p \tag{11.32}$$

so that

$$E[X] = \sum_{k=0}^{n} k\, C(n, k) p^k q^{n-k}, \quad q = 1 - p \tag{11.33}$$

Some algebraic tricks may be used to show that the second form sums to np, but there is no need 
of that. The computation for the affine form is much simpler. 
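
The agreement of the two forms can be checked numerically. The sketch below uses hypothetical parameters n = 10 and p = 0.3, chosen only for illustration, and the base MATLAB function nchoosek for C(n, k).

```matlab
% Hypothetical parameters, chosen only for illustration
n = 10;  p = 0.3;  q = 1 - p;

k  = 0:n;                                     % values of X in canonical form
Cn = arrayfun(@(j) nchoosek(n,j), k);         % binomial coefficients C(n,k)
pk = Cn .* p.^k .* q.^(n-k);                  % P(X = k), as in (11.32)

EX_canonical = k*pk';                         % sum in (11.33)
EX_affine    = n*p;                           % result of (11.31)
disp([EX_canonical EX_affine])
```

Both entries of the displayed row should agree up to rounding error, which illustrates why the affine form is the easier route to E[X].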
Example 11.8: Expected winnings

A bettor places three bets at $2.00 each. The first bet pays $10.00 with probability 0.15, the 
second pays $8.00 with probability 0.20, and the third pays $20.00 with probability 0.10. What is 
the expected gain? 

SOLUTION 

The net gain may be expressed 

$$X = 10 I_A + 8 I_B + 20 I_C - 6, \text{ with } P(A) = 0.15, \; P(B) = 0.20, \; P(C) = 0.10 \tag{11.34}$$

Then

$$E[X] = 10 \cdot 0.15 + 8 \cdot 0.20 + 20 \cdot 0.10 - 6 = -0.90 \tag{11.35}$$

These calculations may be done in MATLAB as follows:

c = [10 8 20 -6];
p = [0.15 0.20 0.10 1.00];    % Constant a = a*I_(Omega), with P(Omega) = 1
E = c*p'
E = -0.9000

Functions of simple random variables 

If X is in a primitive form (including canonical form) and g is a real function defined on the range of X, 
then 

$$Z = g(X) = \sum_{j=1}^{m} g(c_j) I_{C_j} \quad \text{(a primitive form)} \tag{11.36}$$

so that

$$E[Z] = E[g(X)] = \sum_{j=1}^{m} g(c_j) P(C_j) \tag{11.37}$$

Alternately, we may use csort to determine the distribution for Z and work with that distribution.
Caution. If X is in affine form (but not a primitive form)

$$X = c_0 + \sum_{j=1}^{m} c_j I_{E_j} \quad \text{then} \quad g(X) \ne g(c_0) + \sum_{j=1}^{m} g(c_j) I_{E_j} \tag{11.38}$$

so that

$$E[g(X)] \ne g(c_0) + \sum_{j=1}^{m} g(c_j) P(E_j) \tag{11.39}$$
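
The caution can be illustrated with a small numerical case. The sketch below assumes X = I_{E_1} + I_{E_2} with {E_1, E_2} independent and P(E_1) = P(E_2) = 0.5, and g(t) = t^2; these choices, and the names M, vals, and pm, are hypothetical and serve only as an illustration. The correct value is obtained by first passing to a primitive form on the four minterms generated by E_1 and E_2.

```matlab
% Assumed affine form: X = c0 + c(1)*I_E1 + c(2)*I_E2
c0 = 0;  c = [1 1];  pE = [0.5 0.5];     % hypothetical coefficients and probabilities
g  = @(t) t.^2;

% Incorrect: apply g to the affine coefficients, as in (11.39)
wrong = g(c0) + g(c)*pE';

% Correct: primitive form on the minterms; each row of M marks
% membership (1) or non-membership (0) in E1 and E2
M    = [0 0; 0 1; 1 0; 1 1];
vals = c0 + M*c';                         % value of X on each minterm
PE   = repmat(pE, 4, 1);
pm   = prod(M.*PE + (1 - M).*(1 - PE), 2);  % minterm probabilities (independence assumed)
right = g(vals)'*pm;

disp([wrong right])
```

Under these assumptions the two numbers differ, since X takes the value 2 on the minterm E_1 E_2 and g(2) is not g(1) + g(1).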

Example 11.9: Expectation of a function of X 

Suppose X in a primitive form is 

$$X = -3I_{C_1} - I_{C_2} + 2I_{C_3} - 3I_{C_4} + 4I_{C_5} - I_{C_6} + I_{C_7} + 2I_{C_8} + 3I_{C_9} + 2I_{C_{10}} \tag{11.40}$$

with probabilities $P(C_j)$ = 0.08, 0.11, 0.06, 0.13, 0.05, 0.08, 0.12, 0.07, 0.14, 0.16.
Let $g(t) = t^2 + 2t$. Determine $E[g(X)]$.

c = [-3 -1 2 -3 4 -1 1 2 3 2];            % Original coefficients
pc = 0.01*[8 11 6 13 5 8 12 7 14 16];     % Probabilities for C_j
G = c.^2 + 2*c                            % g(c_j)
G = 3 -1 8 3 24 -1 3 8 15 8
EG = G*pc'                                % Direct computation
EG = 6.4200
[Z,PZ] = csort(G,pc);                     % Distribution for Z = g(X)
disp([Z;PZ]')                             % Optional display
   -1.0000    0.1900
    3.0000    0.3300
    8.0000    0.2900
   15.0000    0.1400
   24.0000    0.0500
EZ = Z*PZ'                                % E[Z] from distribution for Z
EZ = 6.4200


A similar approach can be made to a function of a pair of simple random variables, provided the joint
distribution is available. Suppose $X = \sum_{i=1}^{n} t_i I_{A_i}$ and $Y = \sum_{j=1}^{m} u_j I_{B_j}$ (both in canonical form). Then

$$Z = g(X, Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} g(t_i, u_j) I_{A_i B_j} \tag{11.41}$$

The $A_i B_j$ form a partition, so Z is in a primitive form. We have the same two alternative possibilities: (1)
direct calculation from values of $g(t_i, u_j)$ and corresponding probabilities $P(A_i B_j) = P(X = t_i, Y = u_j)$,
or (2) use of csort to obtain the distribution for Z.

Example 11.10: Expectation for Z = g(X, Y) 

We use the joint distribution in file jdemo1.m and let $g(t, u) = t^2 + 2tu - 3u$. To set up for
calculations, we use jcalc.

% file jdemo1.m
X = [-2.37 -1.93 -0.47 -0.11 0 0.57 1.22 2.15 2.97 3.74];
Y = [-3.06 -1.44 -1.21 0.07 0.88 1.77 2.01 2.84]; 
P = 0.0001*[ 53 8 167 170 184 18 67 122 18 12; 
11 13 143 221 241 153 87 125 122 185; 
165 129 226 185 89 215 40 77 93 187; 
165 163 205 64 60 66 118 239 67 201; 
227 2 128 12 238 106 218 120 222 30; 
93 93 22 179 175 186 221 65 129 4; 
126 16 159 80 183 116 15 22 113 167; 
198 101 101 154 158 58 220 230 228 211]; 



jdemo1                           % Call for data
jcalc                            % Set up
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
Use array operations on matrices X, Y, PX, PY, t, u, and P
G = t.^2 + 2*t.*u - 3*u;         % Calculation of matrix of [g(t_i, u_j)]
EG = total(G.*P)                 % Direct calculation of expectation
EG = 3.2529
[Z,PZ] = csort(G,P);             % Determination of distribution for Z
EZ = Z*PZ'                       % E[Z] from distribution
EZ = 3.2529

11.2 Mathematical Expectation; General Random Variables 2 

In this unit, we extend the definition and properties of mathematical expectation to the general case. In the 
process, we note the relationship of mathematical expectation to the Lebesgue integral, which is developed
in abstract measure theory. Although we do not develop this theory, which lies beyond the scope of this 
study, identification of this relationship provides access to a rich and powerful set of properties which have 
far reaching consequences in both application and theory. 

11.2.1 Extension to the General Case 

In the unit on Distribution Approximations (Section 7.2), we show that a bounded random variable X can be
represented as the limit of a nondecreasing sequence of simple random variables. Also, a real random variable
can be expressed as the difference $X = X^+ - X^-$ of two nonnegative random variables. The extension of
mathematical expectation to the general case is based on these facts and certain basic properties of simple
random variables, some of which are established in the unit on expectation for simple random variables. We
list these properties and sketch how the extension is accomplished.

Definition. A condition on a random variable or on a relationship between random variables is said to
hold almost surely, abbreviated "a.s.", iff the condition or relationship holds for all $\omega$ except possibly a set
with probability zero.

Basic properties of simple random variables 

(E0) : If $X = Y$ a.s. then $E[X] = E[Y]$.

(E1) : $E[a I_E] = a P(E)$.

(E2) : Linearity. $X = \sum_{i=1}^{n} a_i X_i$ implies $E[X] = \sum_{i=1}^{n} a_i E[X_i]$.

(E3) : Positivity; monotonicity

a. If $X \ge 0$ a.s., then $E[X] \ge 0$, with equality iff $X = 0$ a.s.

b. If $X \ge Y$ a.s., then $E[X] \ge E[Y]$, with equality iff $X = Y$ a.s.

(E4) : Fundamental lemma. If $X \ge 0$ is bounded and $\{X_n : 1 \le n\}$ is an a.s. nonnegative, nondecreasing
sequence with $\lim_n X_n(\omega) \ge X(\omega)$ for almost every $\omega$, then $\lim_n E[X_n] \ge E[X]$.

(E4a): If for all $n$, $0 \le X_n \le X_{n+1}$ a.s. and $X_n \to X$ a.s., then $E[X_n] \to E[X]$ (i.e., the expectation of
the limit is the limit of the expectations).

Ideas of the proofs of the fundamental properties 

• Modifying the random variable X on a set of probability zero simply modifies one or more of the $A_i$
without changing $P(A_i)$. Such a modification does not change $E[X]$.

• Properties (E1) and (E2) are established in the unit on expectation of simple random variables.

• Positivity (E3a) is a simple property of sums of real numbers. Modification of sets of probability
zero cannot affect the expectation.

• Monotonicity (E3b) is a consequence of positivity and linearity.

$$X \ge Y \text{ a.s. iff } X - Y \ge 0 \text{ a.s., and } E[X] \ge E[Y] \text{ iff } E[X] - E[Y] = E[X - Y] \ge 0 \tag{11.42}$$



2 This content is available online at <http://cnx.org/content/m23412/1.5/>.
• The fundamental lemma (E4) plays an essential role in extending the concept of
expectation. It involves elementary, but somewhat sophisticated, use of linearity and monotonicity,
limited to nonnegative random variables and positive coefficients. We forgo a proof.

• Monotonicity and the fundamental lemma provide a very simple proof of the monotone convergence
theorem, often designated MC. Its role is essential in the extension.

Nonnegative random variables 

There is a nondecreasing sequence of nonnegative simple random variables converging to X. Monotonicity
implies that the expectations of this sequence form a nondecreasing sequence of real numbers, which must
have a limit or increase without bound (in which case we say the limit is infinite). We define
$E[X] = \lim_n E[X_n]$.

Two questions arise. 

1. Is the limit unique? The approximating sequences for a simple random variable are not unique, although 
their limit is the same. 

2. Is the definition consistent? If the limit random variable X is simple, does the new definition coincide 
with the old? 

The fundamental lemma and monotone convergence may be used to show that the answer to both questions 
is affirmative, so that the definition is reasonable. Also, the six fundamental properties survive the passage 
to the limit. 

As a simple application of these ideas, consider discrete random variables such as the geometric $(p)$ or
Poisson $(\mu)$, which are integer-valued but unbounded.

Example 11.11: Unbounded, nonnegative, integer-valued random variables 

The random variable X may be expressed

$$X = \sum_{k=0}^{\infty} k I_{E_k}, \text{ where } E_k = \{X = k\} \text{ with } P(E_k) = p_k \tag{11.43}$$

Let

$$X_n = \sum_{k=0}^{n-1} k I_{E_k} + n I_{B_n}, \text{ where } B_n = \{X \ge n\} \tag{11.44}$$

Then each $X_n$ is a simple random variable with $X_n \le X_{n+1}$. If $X(\omega) = k$, then $X_n(\omega) = k = X(\omega)$
for all $n \ge k + 1$. Hence, $X_n(\omega) \to X(\omega)$ for all $\omega$. By monotone convergence, $E[X_n] \to E[X]$.
Now

$$E[X_n] = \sum_{k=1}^{n-1} k P(E_k) + n P(B_n) \tag{11.45}$$

If $\sum_{k=0}^{\infty} k P(E_k) < \infty$, then

$$0 \le n P(B_n) = n \sum_{k=n}^{\infty} P(E_k) \le \sum_{k=n}^{\infty} k P(E_k) \to 0 \text{ as } n \to \infty \tag{11.46}$$

Hence

$$E[X] = \lim_n E[X_n] = \sum_{k=0}^{\infty} k P(E_k) \tag{11.47}$$

We may use this result to establish the expectation for the geometric and Poisson distributions. 
Example 11.12: X ~ geometric (p)

We have $p_k = P(X = k) = q^k p$, $0 \le k$. By the result of Example 11.11 (Unbounded, nonnegative,
integer-valued random variables)

$$E[X] = \sum_{k=0}^{\infty} k p q^k = pq \sum_{k=1}^{\infty} k q^{k-1} = \frac{pq}{(1 - q)^2} = q/p \tag{11.48}$$

For $Y - 1 \sim$ geometric $(p)$, $p_k = p q^{k-1}$ so that $E[Y] = \frac{1}{q} E[X] = 1/p$.

Example 11.13: X ~ Poisson ($\mu$)

We have $p_k = e^{-\mu} \frac{\mu^k}{k!}$. By the result of Example 11.11 (Unbounded, nonnegative, integer-valued
random variables)

$$E[X] = e^{-\mu} \sum_{k=0}^{\infty} k \frac{\mu^k}{k!} = \mu e^{-\mu} \sum_{k=1}^{\infty} \frac{\mu^{k-1}}{(k-1)!} = \mu e^{-\mu} e^{\mu} = \mu \tag{11.49}$$
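
The monotone approach of $E[X_n]$ to $E[X]$ in Example 11.11 can be observed numerically. The following sketch assumes a geometric distribution with the hypothetical parameter p = 0.25, for which $P(X \ge n) = q^n$ and the limit is q/p; the truncation levels in N are arbitrary.

```matlab
p = 0.25;  q = 1 - p;            % hypothetical parameter
N = [5 10 20 40 80];             % truncation levels n
EXn = zeros(size(N));
for i = 1:length(N)
    n = N(i);
    k = 0:n-1;
    % E[X_n] = sum_{k<n} k P(E_k) + n P(B_n), with P(E_k) = p q^k and P(B_n) = q^n
    EXn(i) = sum(k.*p.*q.^k) + n*q^n;
end
disp([N; EXn]')                  % truncated expectations increase with n
disp(q/p)                        % limiting value E[X]
```

The values in the second column should increase toward q/p, in line with the monotone convergence argument.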

The general case 

We make use of the fact that $X = X^+ - X^-$, where both $X^+$ and $X^-$ are nonnegative. Then

$$E[X] = E[X^+] - E[X^-] \text{ provided at least one of } E[X^+], E[X^-] \text{ is finite} \tag{11.50}$$

Definition. If both $E[X^+]$ and $E[X^-]$ are finite, X is said to be integrable.

The term integrable comes from the relation of expectation to the abstract Lebesgue integral of measure
theory.

Again, the basic properties survive the extension. The property (E0) is subsumed in a
more general uniqueness property noted in the list of properties discussed below.

Theoretical note 

The development of expectation sketched above is exactly the development of the Lebesgue integral of
the random variable X as a measurable function on the basic probability space $(\Omega, \mathcal{F}, P)$, so that

$$E[X] = \int_{\Omega} X \, dP \tag{11.51}$$

As a consequence, we may utilize the properties of the general Lebesgue integral. In its abstract form, it 
is not particularly useful for actual calculations. A careful use of the mapping of probability mass to the 
real line by random variable X produces a corresponding mapping of the integral on the basic space to 
an integral on the real line. Although this integral is also a Lebesgue integral it agrees with the ordinary 
Riemann integral of calculus when the latter exists, so that ordinary integrals may be used to compute 
expectations. 

Additional properties 

The fundamental properties of simple random variables which survive the extension serve as the basis 
of an extensive and powerful list of properties of expectation of real random variables and real functions of 
random vectors. Some of the more important of these are listed in the table in Appendix E (Section 17.5). 
We often refer to these properties by the numbers used in that table. 

Some basic forms 

The mapping theorems provide a number of basic integral (or summation) forms for computation. 

1. In general, if $Z = g(X)$ with distribution functions $F_X$ and $F_Z$, we have the expectation as a Stieltjes
integral.

$$E[Z] = E[g(X)] = \int g(t) F_X(dt) = \int u F_Z(du) \tag{11.52}$$

2. If X and g(X) are absolutely continuous, the Stieltjes integrals are replaced by

$$E[Z] = \int g(t) f_X(t) \, dt = \int u f_Z(u) \, du \tag{11.53}$$

where limits of integration are determined by $f_X$ or $f_Z$. Justification for use of the density function is
provided by the Radon-Nikodym theorem, property (E19).

3. If X is simple, in a primitive form (including canonical form), then

$$E[Z] = E[g(X)] = \sum_{j=1}^{m} g(c_j) P(C_j) \tag{11.54}$$

If the distribution for $Z = g(X)$ is determined by a csort operation, then

$$E[Z] = \sum_{k=1}^{n} v_k P(Z = v_k) \tag{11.55}$$

4. The extension to unbounded, nonnegative, integer-valued random variables is shown in Example 11.11
(Unbounded, nonnegative, integer-valued random variables), above. The finite sums are replaced by
infinite series (provided they converge).

5. For $Z = g(X, Y)$,

$$E[Z] = E[g(X, Y)] = \int\!\!\int g(t, u) F_{XY}(dt\,du) = \int v F_Z(dv) \tag{11.56}$$

6. In the absolutely continuous case

$$E[Z] = E[g(X, Y)] = \int\!\!\int g(t, u) f_{XY}(t, u) \, du \, dt = \int v f_Z(v) \, dv \tag{11.57}$$

7. For joint simple X, Y (see Section 11.1.2: Expectation for simple random variables)

$$E[Z] = E[g(X, Y)] = \sum_{i=1}^{n} \sum_{j=1}^{m} g(t_i, u_j) P(X = t_i, Y = u_j) \tag{11.58}$$

Mechanical interpretation and approximation procedures 

In elementary mechanics, since the total mass is one, the quantity $E[X] = \int t f_X(t) \, dt$ is the location of
the center of mass. This theoretically rigorous fact may be derived heuristically from an examination of the
expectation for a simple approximating random variable. Recall the discussion of the m-procedure for discrete
approximation in the unit on Distribution Approximations (Section 7.2). The range of X is divided into equal
subintervals. The values of the approximating random variable are at the midpoints of the subintervals. The
associated probability is the probability mass in the subinterval, which is approximately $f_X(t_i) \, dx$, where
$dx$ is the length of the subinterval. This approximation improves with an increasing number of subdivisions,
with corresponding decrease in $dx$. The expectation of the approximating simple random variable $X_s$ is

$$E[X_s] = \sum_{i} t_i f_X(t_i) \, dx \approx \int t f_X(t) \, dt \tag{11.59}$$


The approximation improves with increasingly fine subdivisions. The center of mass of the approximating 
distribution approaches the center of mass of the smooth distribution. 
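
The heuristic can be tried on a density whose mean is known. The sketch below assumes the density $f_X(t) = 2t$ on [0, 1], whose exact mean is 2/3; the density, the subdivision counts, and the variable names are illustrative choices and do not reproduce the text's m-procedures.

```matlab
f = @(t) 2*t;                    % assumed density on [0,1]; exact mean is 2/3
a = 0;  b = 1;
nvals = [10 100 1000];           % numbers of subintervals
EXs = zeros(size(nvals));
for i = 1:length(nvals)
    n  = nvals(i);
    dx = (b - a)/n;                      % subinterval length
    t  = a + dx*((1:n) - 0.5);           % subinterval midpoints t_i
    PX = f(t)*dx;                        % approximate mass in each subinterval
    EXs(i) = t*PX';                      % E[X_s] as in (11.59)
end
disp([nvals; EXs]')
```

As the number of subintervals grows, the approximating expectations should settle near 2/3.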
It should be clear that a similar argument for g(X) leads to the integral expression

$$E[g(X)] = \int g(t) f_X(t) \, dt \tag{11.60}$$

This argument shows that we should be able to use tappr to set up for approximating the expectation
$E[g(X)]$ as well as for approximating $P(g(X) \in M)$, etc. We return to this in Section 11.2.2 (Properties
and computation).

Mean values for some absolutely continuous distributions 

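1. Uniform on $[a, b]$. $f_X(t) = \frac{1}{b - a}$, $a \le t \le b$. The center of mass is at $(a + b)/2$. To calculate the value
formally, we write

$$E[X] = \int t f_X(t) \, dt = \frac{1}{b - a} \int_a^b t \, dt = \frac{b^2 - a^2}{2(b - a)} = \frac{b + a}{2} \tag{11.61}$$

2. Symmetric triangular on $[a, b]$. The graph of the density is an isosceles triangle with base on the
interval $[a, b]$. By symmetry, the center of mass, hence the expectation, is at the midpoint $(a + b)/2$.

3. Exponential$(\lambda)$. $f_X(t) = \lambda e^{-\lambda t}$, $0 \le t$. Using a well known definite integral (see Appendix B
(Section 17.2)), we have

$$E[X] = \int t f_X(t) \, dt = \int_0^{\infty} \lambda t e^{-\lambda t} \, dt = 1/\lambda \tag{11.62}$$

4. Gamma$(\alpha, \lambda)$. $f_X(t) = \frac{1}{\Gamma(\alpha)} t^{\alpha - 1} \lambda^{\alpha} e^{-\lambda t}$, $0 \le t$. Again we use one of the integrals in Appendix B
(Section 17.2) to obtain

$$E[X] = \int t f_X(t) \, dt = \frac{1}{\Gamma(\alpha)} \int_0^{\infty} \lambda^{\alpha} t^{\alpha} e^{-\lambda t} \, dt = \frac{\Gamma(\alpha + 1)}{\lambda \Gamma(\alpha)} = \alpha/\lambda \tag{11.63}$$

The last equality comes from the fact that $\Gamma(\alpha + 1) = \alpha \Gamma(\alpha)$.

5. Beta$(r, s)$. $f_X(t) = \frac{\Gamma(r + s)}{\Gamma(r)\Gamma(s)} t^{r-1} (1 - t)^{s-1}$, $0 < t < 1$. We use the fact that
$\int_0^1 u^{r-1} (1 - u)^{s-1} \, du = \frac{\Gamma(r)\Gamma(s)}{\Gamma(r + s)}$, $r > 0$, $s > 0$.

$$E[X] = \frac{\Gamma(r + s)}{\Gamma(r)\Gamma(s)} \int_0^1 t^{r} (1 - t)^{s-1} \, dt = \frac{\Gamma(r + s)}{\Gamma(r)\Gamma(s)} \cdot \frac{\Gamma(r + 1)\Gamma(s)}{\Gamma(r + s + 1)} = \frac{r}{r + s} \tag{11.64}$$

6. Weibull$(\alpha, \lambda, \nu)$. $F_X(t) = 1 - e^{-\lambda (t - \nu)^{\alpha}}$, $\alpha > 0$, $\lambda > 0$, $\nu \ge 0$, $t \ge \nu$. Differentiation shows

$$f_X(t) = \alpha \lambda (t - \nu)^{\alpha - 1} e^{-\lambda (t - \nu)^{\alpha}}, \quad t \ge \nu \tag{11.65}$$

First, consider $Y \sim$ exponential $(\lambda)$. For this random variable

$$E[Y^r] = \int_0^{\infty} t^r \lambda e^{-\lambda t} \, dt = \frac{\Gamma(r + 1)}{\lambda^r} \tag{11.66}$$

If Y is exponential (1), then techniques for functions of random variables show that $\left[\frac{1}{\lambda} Y\right]^{1/\alpha} + \nu \sim$
Weibull $(\alpha, \lambda, \nu)$. Hence,

$$E[X] = \frac{1}{\lambda^{1/\alpha}} E\left[Y^{1/\alpha}\right] + \nu = \frac{1}{\lambda^{1/\alpha}} \Gamma\left(\frac{1}{\alpha} + 1\right) + \nu \tag{11.67}$$

7. Normal$(\mu, \sigma^2)$. The symmetry of the distribution about $t = \mu$ shows that $E[X] = \mu$. This, of course,
may be verified by integration. A standard trick simplifies the work.

$$E[X] = \int_{-\infty}^{\infty} t f_X(t) \, dt = \int_{-\infty}^{\infty} (t - \mu) f_X(t) \, dt + \mu \tag{11.68}$$

We have used the fact that $\int_{-\infty}^{\infty} f_X(t) \, dt = 1$. If we make the change of variable $x = t - \mu$ in the last
integral, the integrand becomes an odd function, so that the integral is zero. Thus, $E[X] = \mu$.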



11.2.2 Properties and computation

The properties in the table in Appendix E (Section 17.5) constitute a powerful and convenient resource for 
the use of mathematical expectation. These are properties of the abstract Lebesgue integral, expressed in 
the notation for mathematical expectation. 

$$E[g(X)] = \int g(X) \, dP \tag{11.69}$$

In the development of additional properties, the four basic properties (E1) Expectation
of indicator functions, (E2) Linearity, (E3) Positivity; monotonicity, and
(E4a) Monotone convergence play a foundational role. We utilize the properties in the
table, as needed, often referring to them by the numbers assigned in the table.

In this section, we include a number of examples which illustrate the use of various properties. Some
are theoretical examples, deriving additional properties or displaying the basis and structure of some in the
table. Others apply these properties to facilitate computation.

Example 11.14: Probability as expectation 

Probability may be expressed entirely in terms of expectation. 

• By properties (E1) and positivity (E3a), $P(A) = E[I_A] \ge 0$.

• As a special case of (E1), we have $P(\Omega) = E[I_{\Omega}] = 1$.

• By the countable sums property (E8),

$$A = \bigvee_i A_i \text{ implies } P(A) = E[I_A] = E\left[\sum_i I_{A_i}\right] = \sum_i E[I_{A_i}] = \sum_i P(A_i) \tag{11.70}$$

Thus, the three defining properties for a probability measure are satisfied.

Remark. There are treatments of probability which characterize mathematical expectation with properties
(E0) through (E4a), then define $P(A) = E[I_A]$. Although such a development is quite feasible, it
has not been widely adopted.

Example 11.15: An indicator function pattern 

Suppose X is a real random variable and $E = X^{-1}(M) = \{\omega : X(\omega) \in M\}$. Then

$$I_E = I_M(X) \tag{11.71}$$

To see this, note that $X(\omega) \in M$ iff $\omega \in E$, so that $I_E(\omega) = 1$ iff $I_M(X(\omega)) = 1$.

Similarly, if $E = X^{-1}(M) \cap Y^{-1}(N)$, then $I_E = I_M(X) I_N(Y)$. We thus have, by (E1),

$$P(X \in M) = E[I_M(X)] \quad \text{and} \quad P(X \in M, Y \in N) = E[I_M(X) I_N(Y)] \tag{11.72}$$
Example 11.16: Alternate interpretation of the mean value

$$E\left[(X - c)^2\right] \text{ is a minimum iff } c = E[X], \text{ in which case } E\left[(X - E[X])^2\right] = E[X^2] - E^2[X] \tag{11.73}$$

INTERPRETATION. If we approximate the random variable X by a constant c, then for any $\omega$
the error of approximation is $X(\omega) - c$. The probability weighted average of the square of the error
(often called the mean squared error) is $E\left[(X - c)^2\right]$. This average squared error is smallest iff
the approximating constant c is the mean value.

VERIFICATION

We expand $(X - c)^2$ and apply linearity to obtain

$$E\left[(X - c)^2\right] = E\left[X^2 - 2cX + c^2\right] = E[X^2] - 2E[X]c + c^2 \tag{11.74}$$

The last expression is a quadratic in c (since $E[X^2]$ and $E[X]$ are constants). The usual calculus
treatment shows the expression has a minimum for $c = E[X]$. Substitution of this value for c shows
the expression reduces to $E[X^2] - E^2[X]$.
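
The minimizing property can be checked by a direct search. The sketch below reuses the simple distribution of Example 11.3 (values 2, 2.5, 3, 3.5 with probabilities 0.2, 0.2, 0.3, 0.3); the search grid and the variable names are illustrative assumptions.

```matlab
X  = [2 2.5 3 3.5];              % values from Example 11.3
PX = 0.1*[2 2 3 3];              % probabilities from Example 11.3
EX  = X*PX';                     % mean value
EX2 = (X.^2)*PX';                % second moment

cgrid = 2:0.01:3.5;              % candidate constants c (assumed grid)
mse = arrayfun(@(c) ((X - c).^2)*PX', cgrid);   % E[(X - c)^2] for each c
[msemin, idx] = min(mse);

disp([cgrid(idx) EX])            % minimizing c and E[X]
disp([msemin EX2 - EX^2])        % minimum value and E[X^2] - E^2[X]
```

On this grid the minimizing constant should coincide with E[X], and the minimum mean squared error with $E[X^2] - E^2[X]$.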

A number of inequalities are listed among the properties in the table. The basis for these inequalities is 
usually some standard analytical inequality on random variables to which the monotonicity property is 
applied. We illustrate with a derivation of the important Jensen's inequality. 

Example 11.17: Jensen's inequality 

If X is a real random variable and g is a convex function on an interval I which includes the range
of X, then

$$g(E[X]) \le E[g(X)] \tag{11.75}$$

VERIFICATION

The function g is convex on I iff for each $t_0 \in I = [a, b]$ there is a number $\lambda(t_0)$ such that

$$g(t) \ge g(t_0) + \lambda(t_0)(t - t_0) \tag{11.76}$$

This means there is a line through $(t_0, g(t_0))$ such that the graph of g lies on or above it. If
$a \le X \le b$, then by monotonicity $E[a] = a \le E[X] \le E[b] = b$ (this is the mean value property
(E11)). We may choose $t_0 = E[X] \in I$. If we designate the constant $\lambda(E[X])$
by c, we have

$$g(X) \ge g(E[X]) + c(X - E[X]) \tag{11.77}$$

Recalling that $E[X]$ is a constant, we take expectation of both sides, using linearity and monotonicity,
to get

$$E[g(X)] \ge g(E[X]) + c(E[X] - E[X]) = g(E[X]) \tag{11.78}$$

Remark. It is easy to show that the function $\lambda(\cdot)$ is nondecreasing. This fact is used in establishing Jensen's
inequality for conditional expectation.
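
Jensen's inequality is easy to observe for a simple random variable. The sketch below uses the distribution obtained in Example 11.6 (values 1, 2, 3 with probabilities 0.42, 0.38, 0.20) and the convex functions $g(t) = e^t$ and $g(t) = t^2$; the choice of functions is an assumption made for illustration.

```matlab
X  = [1 2 3];                    % values from Example 11.6
PX = [0.42 0.38 0.20];           % probabilities from Example 11.6
EX = X*PX';

g1 = @(t) exp(t);                % convex function (illustrative choice)
g2 = @(t) t.^2;                  % convex function (illustrative choice)

lhs1 = g1(EX);   rhs1 = g1(X)*PX';   % g(E[X]) and E[g(X)] for exp
lhs2 = g2(EX);   rhs2 = g2(X)*PX';   % g(E[X]) and E[g(X)] for the square
disp([lhs1 rhs1; lhs2 rhs2])
```

In each row the first entry, $g(E[X])$, should not exceed the second, $E[g(X)]$.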

The product rule for expectations of independent random variables 

Example 11.18: Product rule for simple random variables 

Consider an independent pair $\{X, Y\}$ of simple random variables

$$X = \sum_{i=1}^{n} t_i I_{A_i} \quad Y = \sum_{j=1}^{m} u_j I_{B_j} \quad \text{(both in canonical form)} \tag{11.79}$$

We know that each pair $\{A_i, B_j\}$ is independent, so that $P(A_i B_j) = P(A_i) P(B_j)$. Consider the
product XY. According to the pattern described after Example 11.9 (Expectation of
a function of X) from "Mathematical Expectation: Simple Random Variables,"

$$XY = \sum_{i=1}^{n} t_i I_{A_i} \sum_{j=1}^{m} u_j I_{B_j} = \sum_{i=1}^{n} \sum_{j=1}^{m} t_i u_j I_{A_i B_j} \tag{11.80}$$

The latter double sum is a primitive form, so that

$$E[XY] = \sum_{i=1}^{n} \sum_{j=1}^{m} t_i u_j P(A_i B_j) = \sum_{i=1}^{n} \sum_{j=1}^{m} t_i u_j P(A_i) P(B_j) = \left( \sum_{i=1}^{n} t_i P(A_i) \right) \left( \sum_{j=1}^{m} u_j P(B_j) \right) = E[X] E[Y] \tag{11.81}$$

Thus the product rule holds for independent simple random variables. 
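
The double sum for an independent simple pair can be evaluated directly in base MATLAB. The sketch below assumes illustrative marginal distributions; under independence the joint probability matrix is the outer product of the marginals, and the names P and T are introduced here for illustration.

```matlab
X = [0 1 2 3 4];   PX = 0.1*[1 3 2 3 1];   % assumed marginal for X
Y = [1 3 5 7];     PY = 0.1*[2 2 3 3];     % assumed marginal for Y

P = PX'*PY;                % P(i,j) = P(A_i)P(B_j), by independence
T = X'*Y;                  % T(i,j) = t_i*u_j, the values of XY on A_iB_j

EXY = sum(sum(T.*P));      % double sum over the primitive form
EX  = X*PX';
EY  = Y*PY';
disp([EXY EX*EY])
```

The two displayed numbers should agree up to rounding, as the product rule asserts.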

Example 11.19: Approximating simple functions for an independent pair 

Suppose $\{X, Y\}$ is an independent pair, with an approximating simple pair $\{X_s, Y_s\}$. As functions
of X and Y, respectively, the pair $\{X_s, Y_s\}$ is independent. According to Example 11.18 (Product
rule for simple random variables), above, the product rule $E[X_s Y_s] = E[X_s] E[Y_s]$ must hold.

Example 11.20: Product rule for an independent pair 

For $X \ge 0$, $Y \ge 0$, there exist nondecreasing sequences $\{X_n : 1 \le n\}$ and $\{Y_n : 1 \le n\}$ of
simple random variables increasing to X and Y, respectively. The sequence $\{X_n Y_n : 1 \le n\}$ is
also a nondecreasing sequence of simple random variables, increasing to XY. By the monotone
convergence theorem (MC)

$$E[X_n] \nearrow E[X], \quad E[Y_n] \nearrow E[Y], \quad \text{and} \quad E[X_n Y_n] \nearrow E[XY] \tag{11.82}$$

Since $E[X_n Y_n] = E[X_n] E[Y_n]$ for each n, we conclude $E[XY] = E[X] E[Y]$.
In the general case,

$$XY = (X^+ - X^-)(Y^+ - Y^-) = X^+ Y^+ - X^+ Y^- - X^- Y^+ + X^- Y^- \tag{11.83}$$

Application of the product rule to each nonnegative pair and the use of linearity gives the product
rule for the pair $\{X, Y\}$.

Remark. It should be apparent that the product rule can be extended to any finite independent class. 

Example 11.21: The joint distribution of three random variables 

The class $\{X, Y, Z\}$ is independent, with the marginal distributions shown below. Let $W =
g(X, Y, Z) = 3X^2 + 2XY - 3XYZ$. Determine $E[W]$.

X = 0:4;
Y = 1:2:7;
Z = 0:3:12;
PX = 0.1*[1 3 2 3 1];
PY = 0.1*[2 2 3 3];
PZ = 0.1*[2 2 1 3 2];
icalc3                                   % Setup for joint dbn for {X,Y,Z}
Enter row matrix of X-values  X
Enter row matrix of Y-values  Y
Enter row matrix of Z-values  Z
Enter X probabilities  PX
Enter Y probabilities  PY
Enter Z probabilities  PZ
Use array operations on matrices X, Y, Z,
PX, PY, PZ, t, u, v, and P
EX = X*PX'                               % E[X]
EX = 2
EX2 = (X.^2)*PX'                         % E[X^2]
EX2 = 5.4000
EY = Y*PY'                               % E[Y]
EY = 4.4000
EZ = Z*PZ'                               % E[Z]
EZ = 6.3000
G = 3*t.^2 + 2*t.*u - 3*t.*u.*v;         % W = g(X,Y,Z) = 3X^2 + 2XY - 3XYZ
EG = total(G.*P)                         % E[g(X,Y,Z)]
EG = -132.5200
[W,PW] = csort(G,P);                     % Distribution for W = g(X,Y,Z)
EW = W*PW'                               % E[W]
EW = -132.5200
ew = 3*EX2 + 2*EX*EY - 3*EX*EY*EZ        % Use of linearity and product rule
ew = -132.5200

Example 11.22: A function with a compound definition: truncated exponential 

Suppose X ~ exponential (0.3). Let

$Z = \begin{cases} X^2 & \text{for } X \le 4 \\ 16 & \text{for } X > 4 \end{cases} = I_{[0,4]}(X) X^2 + I_{(4,\infty)}(X) 16$ (11.84)

Determine $E[Z]$.

ANALYTIC SOLUTION

$E[g(X)] = \int g(t) f_X(t)\,dt = \int_0^{\infty} I_{[0,4]}(t)\, t^2\, 0.3 e^{-0.3t}\,dt + 16 E[I_{(4,\infty)}(X)]$ (11.85)

$= \int_0^4 t^2\, 0.3 e^{-0.3t}\,dt + 16 P(X > 4) \approx 7.4972$ (by Maple) (11.86)

APPROXIMATION 

To obtain a simple approximation, we must approximate the exponential by a bounded random
variable. Since $P(X > 50) = e^{-15} \approx 3 \cdot 10^{-7}$, we may safely truncate X at 50.

tappr
Enter matrix [a b] of x-range endpoints  [0 50]
Enter number of x approximation points  1000
Enter density as a function of t  0.3*exp(-0.3*t)
Use row matrices X and PX as in the simple case
M = X <= 4;
G = M.*X.^2 + 16*(1 - M);     % g(X)
EG = G*PX'                    % E[g(X)]
EG = 7.4972
[Z,PZ] = csort(G,PX);         % Distribution for Z = g(X)
EZ = Z*PZ'                    % E[Z] from distribution
EZ = 7.4972

Because of the large number of approximation points, the results agree quite closely with the
theoretical value.
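
The value 7.4972 can also be reproduced without Maple or tappr. The sketch below is plain Python under stated assumptions: the closed form comes from integrating $t^2 \lambda e^{-\lambda t}$ by parts, and the discrete approximation uses midpoints of 1000 equal subintervals of [0, 50] with the probabilities renormalized, which is one plausible reading of what tappr does rather than a transcription of it.

```python
import math

lam, a = 0.3, 4.0          # rate of the exponential, truncation point of g

# Closed form: int_0^a t^2 lam e^{-lam t} dt + a^2 P(X > a)
la = lam * a
integral = (2 / lam**2) * (1 - math.exp(-la) * (1 + la + la**2 / 2))
ez_exact = integral + a**2 * math.exp(-la)

# Discrete approximation on [0, 50] (assumed midpoint scheme, renormalized)
n, b = 1000, 50.0
dx = b / n
ts = [(i + 0.5) * dx for i in range(n)]
px = [lam * math.exp(-lam * t) * dx for t in ts]
total = sum(px)
ez_approx = sum((t**2 if t <= a else a**2) * p for t, p in zip(ts, px)) / total

print(round(ez_exact, 4), round(ez_approx, 4))
```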

Example 11.23: Stocking for random demand (see Exercise 4 (Exercise 10.4) from 
"Problems on Functions of Random Variables") 

The manager of a department store is planning for the holiday season. A certain item costs c 
dollars per unit and sells for p dollars per unit. If the demand exceeds the amount m ordered, 
additional units can be special ordered for s dollars per unit (s > c). If demand is less than 
amount ordered, the remaining stock can be returned (or otherwise disposed of) at r dollars per 
unit (r < c). Demand D for the season is assumed to be a random variable with Poisson ($\mu$)
distribution. Suppose $\mu = 50$, c = 30, p = 50, s = 40, r = 20. What amount m should the manager
order to maximize the expected profit?

PROBLEM FORMULATION 

Suppose D is the demand and X is the profit. Then

For $D \le m$, $X = D(p - c) - (m - D)(c - r) = D(p - r) + m(r - c)$
For $D > m$, $X = m(p - c) + (D - m)(p - s) = D(p - s) + m(s - c)$

It is convenient to write the expression for X in terms of $I_M$, where $M = (-\infty, m]$. Thus

$X = I_M(D)[D(p - r) + m(r - c)] + [1 - I_M(D)][D(p - s) + m(s - c)]$ (11.87)

$= D(p - s) + m(s - c) + I_M(D)[D(p - r) + m(r - c) - D(p - s) - m(s - c)]$ (11.88)

$= D(p - s) + m(s - c) + I_M(D)(s - r)(D - m)$ (11.89)

Then $E[X] = (p - s)E[D] + m(s - c) + (s - r)E[I_M(D)D] - (s - r)mE[I_M(D)]$.

ANALYTIC SOLUTION

For D ~ Poisson ($\mu$), $E[D] = \mu$ and $E[I_M(D)] = P(D \le m)$.

$E[I_M(D)D] = e^{-\mu} \sum_{k=1}^{m} k \frac{\mu^k}{k!} = \mu e^{-\mu} \sum_{k=1}^{m} \frac{\mu^{k-1}}{(k-1)!} = \mu P(D \le m - 1)$ (11.90)

Hence,

$E[X] = (p - s)E[D] + m(s - c) + (s - r)E[I_M(D)D] - (s - r)mE[I_M(D)]$ (11.91)

$= (p - s)\mu + m(s - c) + (s - r)\mu P(D \le m - 1) - (s - r)mP(D \le m)$ (11.92)

Because of the discrete nature of the problem, we cannot solve for the optimum m by ordinary
calculus. We may solve for various m about $m = \mu$ and determine the optimum. We do so with
the aid of MATLAB and the m-function cpoisson.

mu = 50;
c = 30;
p = 50;
s = 40;
r = 20;
m = 45:55;
EX = (p - s)*mu + m*(s - c) + (s - r)*mu*(1 - cpoisson(mu,m)) ...
     - (s - r)*m.*(1 - cpoisson(mu,m+1));
disp([m;EX]')
   45.0000  930.8604
   46.0000  935.5231
   47.0000  939.1895
   48.0000  941.7962
   49.0000  943.2988
   50.0000  943.6750      % Optimum m = 50
   51.0000  942.9247
   52.0000  941.0699
   53.0000  938.1532
   54.0000  934.2347
   55.0000  929.3886
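
The search over m can be repeated from formula (11.92) without the m-function cpoisson. This is a minimal Python sketch; `poisson_cdf` and `expected_profit` are hypothetical helper names introduced here, and the cumulative probability is accumulated term by term rather than taken from any library.

```python
import math

mu, c, p, s, r = 50, 30, 50, 40, 20


def poisson_cdf(k, mu):
    """P(D <= k) for D ~ Poisson(mu), summed term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)      # P(D = 0)
    total = term
    for j in range(1, k + 1):
        term *= mu / j        # P(D = j) from P(D = j - 1)
        total += term
    return total


def expected_profit(m):
    """E[X] from (11.92) for order size m."""
    return ((p - s) * mu + m * (s - c)
            + (s - r) * mu * poisson_cdf(m - 1, mu)
            - (s - r) * m * poisson_cdf(m, mu))


profits = {m: expected_profit(m) for m in range(45, 56)}
best_m = max(profits, key=profits.get)
for m, ex in profits.items():
    print(m, round(ex, 4))
print("optimum m =", best_m)
```

For m = 50 the expression reduces to 1000 - 1000 P(D = 50), which makes the tabulated value easy to confirm by hand.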

A direct solution may be obtained by MATLAB, using finite approximation for the Poisson distribution.

APPROXIMATION

ptest = cpoisson(mu,100)          % Check for suitable value of n
ptest = 3.2001e-10
n = 100;
t = 0:n;
pD = ipoisson(mu,t);
for i = 1:length(m)               % Step by step calculation for various m
  M = t > m(i);
  G(i,:) = t*(p - r) - M.*(t - m(i))*(s - r) - m(i)*(c - r);
end
EG = G*pD';                       % Values agree with theoretical to four decimals

An advantage of the second solution, based on simple approximation to D, is that the distribution of gain
for each m could be studied (e.g., the maximum and minimum gains).

— □ 

Example 11.24: A jointly distributed pair 

Suppose the pair {X, Y} has joint density $f_{XY}(t, u) = 3u$ on the triangular region bounded by
$u = 0$, $u = 1 + t$, $u = 1 - t$ (see Figure 11.2). Let $Z = g(X, Y) = X^2 + 2XY$. Determine $E[Z]$.

Figure 11.2: The density for Example 11.24 (A jointly distributed pair)

ANALYTIC SOLUTION

$E[Z] = \int\!\!\int (t^2 + 2tu) f_{XY}(t, u)\,du\,dt$

$= 3\int_{-1}^{0}\!\int_{0}^{1+t} (t^2 u + 2tu^2)\,du\,dt + 3\int_{0}^{1}\!\int_{0}^{1-t} (t^2 u + 2tu^2)\,du\,dt = 1/10$ (11.93)

APPROXIMATION 

tuappr
Enter matrix [a b] of X-range endpoints  [-1 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  400
Enter number of Y approximation points  200
Enter expression for joint density  3*u.*(u<=min(1+t,1-t))
Use array operations on X, Y, PX, PY, t, u, and P
G = t.^2 + 2*t.*u;            % g(X,Y) = X^2 + 2XY
EG = total(G.*P)              % E[g(X,Y)]
EG = 0.1006                   % Theoretical value = 1/10
[Z,PZ] = csort(G,P);          % Distribution for Z
EZ = Z*PZ'                    % E[Z] from distribution
EZ = 0.1006
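
The two-dimensional approximation in Example 11.24 amounts to a weighted average of g over a grid. The following Python sketch illustrates that idea under assumptions of its own: it places one point at the center of each grid cell and renormalizes the weights, which may differ in detail from what tuappr does, so the last digits need not match 0.1006.

```python
nx, ny = 400, 200                  # grid sizes used in the example
dx, dy = 2.0 / nx, 1.0 / ny        # t in [-1, 1], u in [0, 1]

num = 0.0   # accumulates g(t, u) * weight
den = 0.0   # accumulates weight (used for renormalization)
for i in range(nx):
    t = -1.0 + (i + 0.5) * dx
    for j in range(ny):
        u = (j + 0.5) * dy
        if u <= min(1 + t, 1 - t):         # inside the triangle
            w = 3 * u * dx * dy            # density times cell area
            den += w
            num += (t**2 + 2 * t * u) * w

ez = num / den
print(round(ez, 4))                        # close to the theoretical 1/10
```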

Example 11.25: A function with a compound definition 

The pair {X, Y} has joint density $f_{XY}(t, u) = 1/2$ on the square region bounded by $u = 1 + t$,
$u = 1 - t$, $u = 3 - t$, and $u = t - 1$ (see Figure 11.3),

$W = \begin{cases} X & \text{for } \max\{X, Y\} \le 1 \\ 2Y & \text{for } \max\{X, Y\} > 1 \end{cases} = I_Q(X, Y) X + I_{Q^c}(X, Y) 2Y$ (11.94)

where $Q = \{(t, u) : \max\{t, u\} \le 1\} = \{(t, u) : t \le 1, u \le 1\}$. Determine $E[W]$.

Figure 11.3: The density for Example 11.25 (A function with a compound definition)

ANALYTIC SOLUTION

The intersection of the region Q and the square is the set for which $0 \le t \le 1$ and $1 - t \le u \le 1$.
Reference to the figure shows three regions of integration.



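$E[W] = \frac{1}{2}\int_{0}^{1}\!\int_{1-t}^{1} t\,du\,dt + \frac{1}{2}\int_{0}^{1}\!\int_{1}^{1+t} 2u\,du\,dt + \frac{1}{2}\int_{1}^{2}\!\int_{t-1}^{3-t} 2u\,du\,dt = 11/6 \approx 1.8333$ (11.95)

APPROXIMATION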



tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  ((u<=min(t+1,3-t))&(u>=max(1-t,t-1)))/2
Use array operations on X, Y, PX, PY, t, u, and P
M = max(t,u)<=1;
G = t.*M + 2*u.*(1 - M);      % Z = g(X,Y)
EG = total(G.*P)              % E[g(X,Y)]
EG = 1.8340                   % Theoretical 11/6 = 1.8333
[Z,PZ] = csort(G,P);          % Distribution for Z
EZ = dot(Z,PZ)                % E[Z] from distribution
EZ = 1.8340
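
As a check on the theoretical value 11/6 in Example 11.25, the integral over each of the three regions of integration can be carried out by hand. The intermediate values below are worked here and are not quoted from the text; the limits follow from the description of the square and of the region Q.

```latex
\begin{align*}
\tfrac12\int_0^1\!\int_{1-t}^{1} t\,du\,dt
  &= \tfrac12\int_0^1 t^2\,dt = \tfrac16,\\
\tfrac12\int_0^1\!\int_{1}^{1+t} 2u\,du\,dt
  &= \tfrac12\int_0^1 \bigl[(1+t)^2 - 1\bigr]\,dt
   = \tfrac12\int_0^1 (2t + t^2)\,dt = \tfrac23,\\
\tfrac12\int_1^2\!\int_{t-1}^{3-t} 2u\,du\,dt
  &= \tfrac12\int_1^2 \bigl[(3-t)^2 - (t-1)^2\bigr]\,dt
   = \tfrac12\int_1^2 (8 - 4t)\,dt = 1,\\
E[W] &= \tfrac16 + \tfrac23 + 1 = \tfrac{11}{6} \approx 1.8333.
\end{align*}
```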



Special forms for expectation

The various special forms related to property (E20a) (list, p. 600) are often useful. The general result,
which we do not need, is usually derived by an argument which employs a general form of what is known as
Fubini's theorem. The special form (E20b) (list, p. 601)

$E[X] = \int_{-\infty}^{\infty} [u(t) - F_X(t)]\,dt$ (11.96)

may be derived from (E20a) by use of integration by parts for Stieltjes integrals. However, we use the
relationship between the graph of the distribution function and the graph of the quantile function to show
the equivalence of (E20b) (list, p. 601) and (E20f) (list, p. 601). The latter property is readily established
by elementary arguments.

Example 11.26: The property (E20f) (list, p. 601) 

If Q is the quantile function for the distribution function $F_X$, then

$E[g(X)] = \int_0^1 g[Q(u)]\,du$ (11.97)

VERIFICATION

If $Y = Q(U)$, where U ~ uniform on (0, 1), then Y has the same distribution as X. Hence,

$E[g(X)] = E[g(Q(U))] = \int g(Q(u)) f_U(u)\,du = \int_0^1 g(Q(u))\,du$ (11.98)

Example 11.27: Reliability and expectation 

In reliability, if X is the life duration (time to failure) for a device, the reliability function is the 
probability at any time t the device is still operative. Thus 

$R(t) = P(X > t) = 1 - F_X(t)$ (11.99)

According to property (E20b) (list, p. 601)

$E[X] = \int_0^{\infty} R(t)\,dt$ (11.100)

Example 11.28: Use of the quantile function 

Suppose $F_X(t) = t^a$, $a > 0$, $0 \le t \le 1$. Then $Q(u) = u^{1/a}$, $0 \le u \le 1$.

$E[X] = \int_0^1 u^{1/a}\,du = \frac{1}{1 + 1/a} = \frac{a}{a + 1}$ (11.101)

The same result could be obtained by using $f_X(t) = F_X'(t)$ and evaluating $\int t f_X(t)\,dt$.
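
The quantile-function form of expectation in Example 11.28 lends itself to a quick numerical check. The Python sketch below integrates $Q(u) = u^{1/a}$ over (0, 1) by the midpoint rule; the value a = 2.5 and the number of subintervals are arbitrary choices made for illustration.

```python
a = 2.5                    # hypothetical value of the parameter a > 0
n = 100_000                # number of midpoint subintervals on (0, 1)
du = 1.0 / n

# E[X] = integral over (0, 1) of Q(u) = u**(1/a), by the midpoint rule
ex_quantile = sum(((k + 0.5) * du) ** (1 / a) for k in range(n)) * du
ex_exact = a / (a + 1)     # closed form from (11.101)

print(round(ex_quantile, 6), round(ex_exact, 6))
```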

Example 11.29: Equivalence of (E20b) (list, p. 601) and (E20f) (list, p. 601) 

For the special case $g(X) = X$, Figure 3(a) shows $\int_0^1 Q(u)\,du$ is the difference in the shaded areas

$\int_0^1 Q(u)\,du = \text{Area } A - \text{Area } B$ (11.102)

The corresponding graph of the distribution function F is shown in Figure 3(b) (Figure 11.4).
Because of the construction, the areas of the regions marked A and B are the same in the two
figures. As may be seen,

$\text{Area } A = \int_0^{\infty} [1 - F(t)]\,dt$ and $\text{Area } B = \int_{-\infty}^{0} F(t)\,dt$ (11.103)

Use of the unit step function $u(t) = 1$ for $t > 0$ and $0$ for $t < 0$ (defined arbitrarily at $t = 0$)
enables us to combine the two expressions to get

$\int_0^1 Q(u)\,du = \text{Area } A - \text{Area } B = \int_{-\infty}^{\infty} [u(t) - F(t)]\,dt$ (11.104)

Figure 11.4: Equivalence of properties (E20b) (list, p. 601) and (E20f) (list, p. 601).



Property (E20c) (list, p. 601) is a direct result of linearity and (E20b) (list, p. 601), with the unit step
functions cancelling out.

Example 11.30: Property (E20d) (list, p. 601) Useful inequalities

Suppose $X \ge 0$. Then

$\sum_{n=0}^{\infty} P(X \ge n + 1) \le E[X] \le \sum_{n=0}^{\infty} P(X \ge n) \le N \sum_{k=0}^{\infty} P(X \ge kN)$, for all $N \ge 1$ (11.105)

VERIFICATION

For $X \ge 0$, by (E20b) (list, p. 601)

$E[X] = \int_0^{\infty} [1 - F(t)]\,dt = \int_0^{\infty} P(X > t)\,dt$ (11.106)

Since F can have only a countable number of jumps on any interval and $P(X > t)$ and $P(X \ge t)$
differ only at jump points, we may assert

$\int_a^b P(X > t)\,dt = \int_a^b P(X \ge t)\,dt$ (11.107)

For each nonnegative integer n, let $E_n = [n, n + 1)$. By the countable additivity of expectation

$E[X] = \sum_{n=0}^{\infty} E[I_{E_n} X] = \sum_{n=0}^{\infty} \int_{E_n} P(X \ge t)\,dt$ (11.108)

Since $P(X \ge t)$ is decreasing with t and each $E_n$ has unit length, we have by the mean value
theorem

$P(X \ge n + 1) \le E[I_{E_n} X] \le P(X \ge n)$ (11.109)

The third inequality follows from the fact that

$\int_{kN}^{(k+1)N} P(X \ge t)\,dt \le N \int_{E_{kN}} P(X \ge t)\,dt \le N P(X \ge kN)$ (11.110)

Remark. Property (E20d) (list, p. 601) is used primarily for theoretical purposes. The special case (E20e) 
(list, p. 601) is more frequently used. 
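
The chain of inequalities in property (E20d) can be seen numerically for a specific nonnegative random variable. The Python sketch below uses an exponential distribution, for which $P(X \ge t) = e^{-\lambda t}$; the rate, the block size N, and the truncation point of the infinite sums are all hypothetical choices made for illustration.

```python
import math

lam, N = 0.3, 3          # hypothetical rate and block size N >= 1
terms = 5000             # truncation point for the infinite sums


def tail(t):
    """P(X >= t) for X ~ exponential(lam), t >= 0."""
    return math.exp(-lam * t)


lower = sum(tail(n + 1) for n in range(terms))       # sum of P(X >= n + 1)
mean = 1 / lam                                       # E[X]
upper = sum(tail(n) for n in range(terms))           # sum of P(X >= n)
block = N * sum(tail(k * N) for k in range(terms))   # N * sum of P(X >= kN)

print(lower, mean, upper, block)
```

The first two sums differ by exactly P(X >= 0) = 1, so the inequalities always bracket E[X] within an interval of unit length.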

Example 11.31: Property (E20e) (list, p. 601) 

If X is nonnegative, integer valued, then

$E[X] = \sum_{k=1}^{\infty} P(X \ge k) = \sum_{k=0}^{\infty} P(X > k)$ (11.111)

VERIFICATION 

The result follows as a special case of (E20d) (list, p. 601). For integer valued random variables,

$P(X \ge t) = P(X \ge n)$ on $E_n$ and $P(X \ge t) = P(X > n) = P(X \ge n + 1)$ on $E_{n+1}$ (11.112)

An elementary derivation of (E20e) (list, p. 601) can be constructed as follows. 

Example 11.32: (E20e) (list, p. 601) for integer-valued random variables 

By definition

$E[X] = \sum_{k=1}^{\infty} k P(X = k) = \lim_n \sum_{k=1}^{n} k P(X = k)$ (11.113)

Now for each finite n,

$\sum_{k=1}^{n} k P(X = k) = \sum_{k=1}^{n} \sum_{j=1}^{k} P(X = k) = \sum_{j=1}^{n} \sum_{k=j}^{n} P(X = k) = \sum_{j=1}^{n} P(X \ge j)$ (11.114)

Taking limits as $n \to \infty$ yields the desired result.

Example 11.33: The geometric distribution 

Suppose X ~ geometric (p). Then $P(X \ge k) = q^k$. Use of (E20e) (list, p. 601) gives

$E[X] = \sum_{k=1}^{\infty} q^k = q \sum_{k=0}^{\infty} q^k = \frac{q}{1 - q} = q/p$ (11.115)
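
Property (E20e) and the defining sum can be compared directly for the geometric distribution. The Python sketch below assumes the form $P(X = k) = p q^k$ for $k = 0, 1, 2, \ldots$ (consistent with $P(X \ge k) = q^k$), with a hypothetical value of p and with both infinite sums truncated.

```python
p = 0.25                  # hypothetical success probability
q = 1 - p
terms = 2000              # truncation point; q**terms is negligible

tail_sum = sum(q**k for k in range(1, terms + 1))       # sum of P(X >= k), k >= 1
direct = sum(k * p * q**k for k in range(terms + 1))    # sum of k P(X = k)

print(tail_sum, direct, q / p)
```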



11.3 Problems on Mathematical Expectation 3 

Exercise 11.1 (Solution on p. 334.) 

(See Exercise 1 (Exercise 7.1) from "Problems on Distribution and Density Functions", m-file
npr07_01.m (Section 17.8.30: npr07_01)). The class $\{C_j : 1 \le j \le 10\}$ is a partition. Random
variable X has values {1,3,2,3,4,2,1,3,5,2} on $C_1$ through $C_{10}$, respectively, with probabilities
0.08, 0.13, 0.06, 0.09, 0.14, 0.11, 0.12, 0.07, 0.11, 0.09. Determine $E[X]$.

Exercise 11.2 (Solution on p. 334.) 

(See Exercise 2 (Exercise 7.2) from "Problems on Distribution and Density Functions", m-file 
npr07_02.m (Section 17.8.31: npr07_02)). A store has eight items for sale. The prices are $3.50,
$5.00, $3.50, $7.50, $5.00, $5.00, $3.50, and $7.50, respectively. A customer comes in. She purchases
one of the items with probabilities 0.10, 0.15, 0.15, 0.20, 0.10, 0.05, 0.10, 0.15. The random variable
expressing the amount of her purchase may be written

$X = 3.5 I_{C_1} + 5.0 I_{C_2} + 3.5 I_{C_3} + 7.5 I_{C_4} + 5.0 I_{C_5} + 5.0 I_{C_6} + 3.5 I_{C_7} + 7.5 I_{C_8}$ (11.116)

Determine the expectation $E[X]$ of the value of her purchase.

Exercise 11.3 (Solution on p. 334.) 

(See Exercise 12 (Exercise 6.12) from "Problems on Random Variables and Probabilities", and Ex- 
ercise 3 (Exercise 7.3) from "Problems on Distribution and Density Functions," m-file npr06_12.m 
(Section 17.8.28: npr06_12)). The class {A, B, C, D} has minterm probabilities

pm = 0.001 * [5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302] (11.117)

Determine the mathematical expectation for the random variable $X = I_A + I_B + I_C + I_D$, which
counts the number of the events which occur on a trial.

Exercise 11.4 (Solution on p. 335.) 

(See Exercise 5 (Exercise 7.5) from "Problems on Distribution and Density Functions"). In a thunderstorm
in a national park there are 127 lightning strikes. Experience shows that the probability
of a lightning strike starting a fire is about 0.0083. Determine the expected number of fires.

Exercise 11.5 (Solution on p. 335.)

(See Exercise 8 (Exercise 7.8) from "Problems on Distribution and Density Functions"). Two coins
are flipped twenty times. Let X be the number of matches (both heads or both tails). Determine
$E[X]$.



3 This content is available online at <http://cnx.org/content/m24366/1.4/>.

Exercise 11.6 (Solution on p. 335.)

(See Exercise 12 (Exercise 7.12) from "Problems on Distribution and Density Functions"). A
residential College plans to raise money by selling "chances" on a board. Fifty chances are sold. A
player pays $10 to play; he or she wins $30 with probability p = 0.2. The profit to the College is

$X = 50 \cdot 10 - 30N$, where N is the number of winners (11.118)

Determine the expected profit $E[X]$.

Exercise 11.7 (Solution on p. 335.) 

(See Exercise 19 (Exercise 7.19) from "Problems on Distribution and Density Functions"). The 
number of noise pulses arriving on a power circuit in an hour is a random quantity having Poisson 
(7) distribution. What is the expected number of pulses in an hour? 

Exercise 11.8 (Solution on p. 335.) 

(See Exercise 24 (Exercise 7.24) and Exercise 25 (Exercise 7.25) from "Problems on Distribution 
and Density Functions"). The total operating time for the units in Exercise 24 (Exercise 7.24) is a 
random variable T ~ gamma (20, 0.0002). What is the expected operating time? 

Exercise 11.9 (Solution on p. 335.) 

(See Exercise 41 (Exercise 7.41) from "Problems on Distribution and Density Functions"). Random 
variable X has density function 

$f_X(t) = \begin{cases} (6/5)\,t^2 & \text{for } 0 \le t \le 1 \\ (6/5)(2 - t) & \text{for } 1 < t \le 2 \end{cases} = I_{[0,1]}(t)\,\frac{6}{5} t^2 + I_{(1,2]}(t)\,\frac{6}{5}(2 - t)$ (11.119)

What is the expected value $E[X]$?

Exercise 11.10 (Solution on p. 335.)

Truncated exponential. Suppose X ~ exponential ($\lambda$) and $Y = I_{[0,a]}(X) X + I_{(a,\infty)}(X) a$.

a. Use the fact that

$\int_0^{\infty} t e^{-\lambda t}\,dt = \frac{1}{\lambda^2}$ and $\int_a^{\infty} t e^{-\lambda t}\,dt = \frac{1}{\lambda^2} e^{-\lambda a}(1 + \lambda a)$ (11.120)

to determine an expression for $E[Y]$.

b. Use the approximation method, with $\lambda = 1/50$, $a = 30$. Approximate the exponential at
10,000 points for $0 \le t \le 1000$. Compare the approximate result with the theoretical result
of part (a).

Exercise 11.11 (Solution on p. 335.) 

(See Exercise 1 (Exercise 8.1) from "Problems On Random Vectors and Joint Distributions", m-file 
npr08_01.m (Section 17.8.32: npr08_01)). Two cards are selected at random, without replacement, 
from a standard deck. Let X be the number of aces and Y be the number of spades. Under the 
usual assumptions, determine the joint distribution. Determine $E[X]$, $E[Y]$, $E[X^2]$, $E[Y^2]$, and
$E[XY]$.

Exercise 11.12 (Solution on p. 336.) 

(See Exercise 2 (Exercise 8.2) from "Problems On Random Vectors and Joint Distributions", m- 
file npr08_02.m (Section 17.8.33: npr08_02) ). Two positions for campus jobs are open. Two 
sophomores, three juniors, and three seniors apply. It is decided to select two at random (each 
possible pair equally likely). Let X be the number of sophomores and Y be the number of juniors 
who are selected. Determine the joint distribution for {X, Y} and $E[X]$, $E[Y]$, $E[X^2]$, $E[Y^2]$,
and $E[XY]$.

Exercise 11.13 (Solution on p. 336.)

(See Exercise 3 (Exercise 8.3) from "Problems On Random Vectors and Joint Distributions", m-file
npr08_03.m (Section 17.8.34: npr08_03)). A die is rolled. Let X be the number of spots that
turn up. A coin is flipped X times. Let Y be the number of heads that turn up. Determine the
joint distribution for the pair {X, Y}. Assume $P(X = k) = 1/6$ for $1 \le k \le 6$ and for each k,
$P(Y = j | X = k)$ has the binomial (k, 1/2) distribution. Arrange the joint matrix as on the plane,
with values of Y increasing upward. Determine the expected value $E[Y]$.

Exercise 11.14 (Solution on p. 336.) 

(See Exercise 4 (Exercise 8.4) from "Problems On Random Vectors and Joint Distributions", m-file 
npr08_04.m (Section 17.8.35: npr08_04) ). As a variation of Exercise 11.13, suppose a pair of dice 
is rolled instead of a single die. Determine the joint distribution for {X, Y} and determine E [Y]. 

Exercise 11.15 (Solution on p. 337.) 

(See Exercise 5 (Exercise 8.5) from "Problems On Random Vectors and Joint Distributions", m-file 
npr08_05.m (Section 17.8.36: npr08_05)). Suppose a pair of dice is rolled. Let X be the total 
number of spots which turn up. Roll the pair an additional X times. Let Y be the number of 
sevens that are thrown on the X rolls. Determine the joint distribution for {X, Y} and determine 
E[Y]. 
Exercise 11.16 (Solution on p. 337.) 

(See Exercise 6 (Exercise 8.6) from "Problems On Random Vectors and Joint Distributions", m-file 
npr08_06.m (Section 17.8.37: npr08_06)). The pair {X, Y} has the joint distribution:

X = [-2.3 -0.7 1.1 3.9 5.1]   Y = [1.3 2.5 4.1 5.3] (11.121)

P = [0.0483  0.0357  0.0420  0.0399  0.0441
     0.0437  0.0323  0.0380  0.0361  0.0399
     0.0713  0.0527  0.0620  0.0609  0.0551
     0.0667  0.0493  0.0580  0.0651  0.0589] (11.122)

Determine $E[X]$, $E[Y]$, $E[X^2]$, $E[Y^2]$, and $E[XY]$.

Exercise 11.17 (Solution on p. 337.) 

(See Exercise 7 (Exercise 8.7) from "Problems On Random Vectors and Joint Distributions", m-file 
npr08_07.m (Section 17.8.38: npr08_07)). The pair {X, Y} has the joint distribution: 



P(X = t, Y = u) (11.123)

t =         -3.1     -0.5      1.2      2.4      3.7      4.9
u = 7.5   0.0090   0.0396   0.0594   0.0216   0.0440   0.0203
    4.1   0.0495   0        0.1089   0.0528   0.0363   0.0231
   -2.0   0.0405   0.1320   0.0891   0.0324   0.0297   0.0189
   -3.8   0.0510   0.0484   0.0726   0.0132   0        0.0077

Table 11.1

Determine $E[X]$, $E[Y]$, $E[X^2]$, $E[Y^2]$, and $E[XY]$.

Exercise 11.18 (Solution on p. 337.)

(See Exercise 8 (Exercise 8.8) from "Problems On Random Vectors and Joint Distributions", m-file 
npr08_08.m (Section 17.8.39: npr08_08)). The pair {X, Y} has the joint distribution: 



P(X = t, Y = u) (11.124)

t =          1        3        5        7        9       11       13       15       17       19
u = 12   0.0156   0.0191   0.0081   0.0035   0.0091   0.0070   0.0098   0.0056   0.0091   0.0049
    10   0.0064   0.0204   0.0108   0.0040   0.0054   0.0080   0.0112   0.0064   0.0104   0.0056
     9   0.0196   0.0256   0.0126   0.0060   0.0156   0.0120   0.0168   0.0096   0.0056   0.0084
     5   0.0112   0.0182   0.0108   0.0070   0.0182   0.0140   0.0196   0.0012   0.0182   0.0038
     3   0.0060   0.0260   0.0162   0.0050   0.0160   0.0200   0.0280   0.0060   0.0160   0.0040
    -1   0.0096   0.0056   0.0072   0.0060   0.0256   0.0120   0.0268   0.0096   0.0256   0.0084
    -3   0.0044   0.0134   0.0180   0.0140   0.0234   0.0180   0.0252   0.0244   0.0234   0.0126
    -5   0.0072   0.0017   0.0063   0.0045   0.0167   0.0090   0.0026   0.0172   0.0217   0.0223

Table 11.2

Determine $E[X]$, $E[Y]$, $E[X^2]$, $E[Y^2]$, and $E[XY]$.

Exercise 11.19 (Solution on p. 338.) 

(See Exercise 9 (Exercise 8.9) from "Problems On Random Vectors and Joint Distributions", m-file 
npr08_09.m (Section 17.8.40: npr08_09)). Data were kept on the effect of training time on the 
time to perform a job on a production line. X is the amount of training, in hours, and Y is the 
time to perform the task, in minutes. The data are as follows: 



P(X = t, Y = u) (11.125)

t =        1      1.5      2      2.5      3
u = 5   0.039   0.011   0.005   0.001   0.001
    4   0.065   0.070   0.050   0.015   0.010
    3   0.031   0.061   0.137   0.051   0.033
    2   0.012   0.049   0.163   0.058   0.039
    1   0.003   0.009   0.045   0.025   0.017

Table 11.3

Determine $E[X]$, $E[Y]$, $E[X^2]$, $E[Y^2]$, and $E[XY]$.
For the joint densities in Exercises 20-32 below

a. Determine analytically $E[X]$, $E[Y]$, $E[X^2]$, $E[Y^2]$, and $E[XY]$.

b. Use a discrete approximation for $E[X]$, $E[Y]$, $E[X^2]$, $E[Y^2]$, and $E[XY]$.

Exercise 11.20 (Solution on p. 338.)

(See Exercise 10 (Exercise 8.10) from "Problems On Random Vectors and Joint Distributions").
$f_{XY}(t, u) = 1$ for $0 \le t \le 1$, $0 \le u \le 2(1 - t)$.

$f_X(t) = 2(1 - t)$, $0 \le t \le 1$, $f_Y(u) = 1 - u/2$, $0 \le u \le 2$ (11.126)

Exercise 11.21 (Solution on p. 338.)

(See Exercise 11 (Exercise 8.11) from "Problems On Random Vectors and Joint Distributions").
$f_{XY}(t, u) = 1/2$ on the square with vertices at (1,0), (2,1), (1,2), (0,1).

$f_X(t) = f_Y(t) = I_{[0,1]}(t)\,t + I_{(1,2]}(t)(2 - t)$ (11.127)

Exercise 11.22 (Solution on p. 339.)

(See Exercise 12 (Exercise 8.12) from "Problems On Random Vectors and Joint Distributions").
$f_{XY}(t, u) = 4t(1 - u)$ for $0 \le t \le 1$, $0 \le u \le 1$.

$f_X(t) = 2t$, $0 \le t \le 1$, $f_Y(u) = 2(1 - u)$, $0 \le u \le 1$ (11.128)

Exercise 11.23 (Solution on p. 339.)

(See Exercise 13 (Exercise 8.13) from "Problems On Random Vectors and Joint Distributions").
$f_{XY}(t, u) = \frac{1}{8}(t + u)$ for $0 \le t \le 2$, $0 \le u \le 2$.

$f_X(t) = f_Y(t) = \frac{1}{4}(t + 1)$, $0 \le t \le 2$ (11.129)

Exercise 11.24 (Solution on p. 339.)

(See Exercise 14 (Exercise 8.14) from "Problems On Random Vectors and Joint Distributions").
$f_{XY}(t, u) = 4u e^{-2t}$ for $0 \le t$, $0 \le u \le 1$.

$f_X(t) = 2e^{-2t}$, $0 \le t$, $f_Y(u) = 2u$, $0 \le u \le 1$ (11.130)

Exercise 11.25 (Solution on p. 339.)

(See Exercise 15 (Exercise 8.15) from "Problems On Random Vectors and Joint Distributions").
$f_{XY}(t, u) = \frac{3}{88}(2t + 3u^2)$ for $0 \le t \le 2$, $0 \le u \le 1 + t$.

$f_X(t) = \frac{3}{88}(1 + t)(1 + 4t + t^2) = \frac{3}{88}(1 + 5t + 5t^2 + t^3)$, $0 \le t \le 2$ (11.131)

$f_Y(u) = I_{[0,1]}(u)\,\frac{3}{88}(6u^2 + 4) + I_{(1,3]}(u)\,\frac{3}{88}(3 + 2u + 8u^2 - 3u^3)$ (11.132)

Exercise 11.26 (Solution on p. 339.)

(See Exercise 16 (Exercise 8.16) from "Problems On Random Vectors and Joint Distributions").
$f_{XY}(t, u) = 12t^2 u$ on the parallelogram with vertices

(-1,0), (0,0), (1,1), (0,1) (11.133)

$f_X(t) = I_{[-1,0]}(t)\,6t^2(t + 1)^2 + I_{(0,1]}(t)\,6t^2(1 - t^2)$, $f_Y(u) = 12u^3 - 12u^2 + 4u$, $0 \le u \le 1$ (11.134)

Exercise 11.27 (Solution on p. 339.)

(See Exercise 17 (Exercise 8.17) from "Problems On Random Vectors and Joint Distributions").
$f_{XY}(t, u) = \frac{24}{11} tu$ for $0 \le t \le 2$, $0 \le u \le \min\{1, 2 - t\}$.

$f_X(t) = I_{[0,1]}(t)\,\frac{12}{11} t + I_{(1,2]}(t)\,\frac{12}{11} t(2 - t)^2$, $f_Y(u) = \frac{12}{11} u(u - 2)^2$, $0 \le u \le 1$ (11.135)

Exercise 11.28 (Solution on p. 340.)

(See Exercise 18 (Exercise 8.18) from "Problems On Random Vectors and Joint Distributions").
$f_{XY}(t, u) = \frac{3}{23}(t + 2u)$ for $0 \le t \le 2$, $0 \le u \le \max\{2 - t, t\}$.

$f_X(t) = I_{[0,1]}(t)\,\frac{6}{23}(2 - t) + I_{(1,2]}(t)\,\frac{6}{23} t^2$, $f_Y(u) = I_{[0,1]}(u)\,\frac{6}{23}(2u + 1) + I_{(1,2]}(u)\,\frac{3}{23}(4 + 6u - 4u^2)$ (11.136)

Exercise 11.29 (Solution on p. 340.)

(See Exercise 19 (Exercise 8.19) from "Problems On Random Vectors and Joint Distributions").
$f_{XY}(t, u) = \frac{12}{179}(3t^2 + u)$, for $0 \le t \le 2$, $0 \le u \le \min\{2, 3 - t\}$.

$f_X(t) = I_{[0,1]}(t)\,\frac{24}{179}(3t^2 + 1) + I_{(1,2]}(t)\,\frac{6}{179}(9 - 6t + 19t^2 - 6t^3)$ (11.137)

$f_Y(u) = I_{[0,1]}(u)\,\frac{24}{179}(4 + u) + I_{(1,2]}(u)\,\frac{12}{179}(27 - 24u + 8u^2 - u^3)$ (11.138)

Exercise 11.30 (Solution on p. 340.)

(See Exercise 20 (Exercise 8.20) from "Problems On Random Vectors and Joint Distributions").
$f_{XY}(t, u) = \frac{12}{227}(3t + 2tu)$, for $0 \le t \le 2$, $0 \le u \le \min\{1 + t, 2\}$.

$f_X(t) = I_{[0,1]}(t)\,\frac{12}{227}(t^3 + 5t^2 + 4t) + I_{(1,2]}(t)\,\frac{120}{227} t$ (11.139)

$f_Y(u) = I_{[0,1]}(u)\,\frac{24}{227}(2u + 3) + I_{(1,2]}(u)\,\frac{6}{227}(2u + 3)(3 + 2u - u^2)$ (11.140)

$= I_{[0,1]}(u)\,\frac{24}{227}(2u + 3) + I_{(1,2]}(u)\,\frac{6}{227}(9 + 12u + u^2 - 2u^3)$ (11.141)

Exercise 11.31 (Solution on p. 340.)

(See Exercise 21 (Exercise 8.21) from "Problems On Random Vectors and Joint Distributions").
$f_{XY}(t, u) = \frac{2}{13}(t + 2u)$, for $0 \le t \le 2$, $0 \le u \le \min\{2t, 3 - t\}$.

$f_X(t) = I_{[0,1]}(t)\,\frac{12}{13} t^2 + I_{(1,2]}(t)\,\frac{6}{13}(3 - t)$ (11.142)

$f_Y(u) = I_{[0,1]}(u)\left(\frac{4}{13} + \frac{8}{13} u - \frac{9}{52} u^2\right) + I_{(1,2]}(u)\left(\frac{9}{13} + \frac{6}{13} u - \frac{21}{52} u^2\right)$ (11.143)

Exercise 11.32 (Solution on p. 340.)

(See Exercise 22 (Exercise 8.22) from "Problems On Random Vectors and Joint Distributions").

$f_{XY}(t, u) = I_{[0,1]}(t)\,\frac{3}{8}(t^2 + 2u) + I_{(1,2]}(t)\,\frac{9}{14} t^2 u^2$, for $0 \le u \le 1$.

$f_X(t) = I_{[0,1]}(t)\,\frac{3}{8}(t^2 + 1) + I_{(1,2]}(t)\,\frac{3}{14} t^2$, $f_Y(u) = \frac{1}{8} + \frac{3}{4} u + \frac{3}{2} u^2$, $0 \le u \le 1$ (11.144)



Exercise 11.33 (Solution on p. 340.) 

The class {X, Y, Z} of random variables is iid (independent, identically distributed) with common 
distribution 



X = [-5 -1 3 4 7]   PX = 0.01 * [15 20 30 25 10] (11.145)

Let $W = 3X - 4Y + 2Z$. Determine $E[W]$. Do this using icalc, then repeat with icalc3 and
compare results.

Exercise 11.34 (Solution on p. 341.) 

(See Exercise 5 (Exercise 10.5) from "Problems on Functions of Random Variables") The cultural 
committee of a student organization has arranged a special deal for tickets to a concert. The 
agreement is that the organization will purchase ten tickets at $20 each (regardless of the number 
of individual buyers). Additional tickets are available according to the following schedule: 

11-20, $18 each; 21-30 $16 each; 31-50, $15 each; 51-100, $13 each 

If the number of purchasers is a random variable X, the total cost (in dollars) is a random 
quantity Z = g (X) described by 



$g(X) = 200 + 18 I_{M1}(X)(X - 10) + (16 - 18) I_{M2}(X)(X - 20) +$ (11.146)

$(15 - 16) I_{M3}(X)(X - 30) + (13 - 15) I_{M4}(X)(X - 50)$ (11.147)

where $M1 = [10, \infty)$, $M2 = [20, \infty)$, $M3 = [30, \infty)$, $M4 = [50, \infty)$ (11.148)

Suppose X ~ Poisson (75). Approximate the Poisson distribution by truncating at 150. Determine
$E[Z]$ and $E[Z^2]$.

Exercise 11.35 (Solution on p. 341.) 

The pair {X, Y} has the joint distribution (in m-file npr08_07.m (Section 17.8.38: npr08_07)): 



P(X = t, Y = u) (11.149)

t =         -3.1     -0.5      1.2      2.4      3.7      4.9
u = 7.5   0.0090   0.0396   0.0594   0.0216   0.0440   0.0203
    4.1   0.0495   0        0.1089   0.0528   0.0363   0.0231
   -2.0   0.0405   0.1320   0.0891   0.0324   0.0297   0.0189
   -3.8   0.0510   0.0484   0.0726   0.0132   0        0.0077

Table 11.4

Let $Z = g(X, Y) = 3X^2 + 2XY - Y^2$. Determine $E[Z]$ and $E[Z^2]$.

Exercise 11.36 (Solution on p. 342.)

For the pair {X, Y} in Exercise 11.35, let

$W = g(X, Y) = \begin{cases} X & \text{for } X + Y \le 4 \\ 2Y & \text{for } X + Y > 4 \end{cases} = I_M(X, Y) X + I_{M^c}(X, Y) 2Y$ (11.150)

Determine $E[W]$ and $E[W^2]$.

For the distributions in Exercises 37-41 below

a. Determine analytically $E[Z]$ and $E[Z^2]$.

b. Use a discrete approximation to calculate the same quantities.

Exercise 11.37 (Solution on p. 342.)

$f_{XY}(t, u) = \frac{3}{88}(2t + 3u^2)$ for $0 \le t \le 2$, $0 \le u \le 1 + t$ (see Exercise 11.25).

$Z = I_{[0,1]}(X)\,4X + I_{(1,2]}(X)(X + Y)$ (11.151)

Exercise 11.38 (Solution on p. 342.)

$f_{XY}(t, u) = \frac{24}{11} tu$ for $0 \le t \le 2$, $0 \le u \le \min\{1, 2 - t\}$ (see Exercise 11.27).

$Z = I_M(X, Y)\,\frac{1}{2} X + I_{M^c}(X, Y)\,Y^2$, $M = \{(t, u) : u > t\}$ (11.152)

Exercise 11.39 (Solution on p. 342.)

$f_{XY}(t, u) = \frac{3}{23}(t + 2u)$ for $0 \le t \le 2$, $0 \le u \le \max\{2 - t, t\}$ (see Exercise 11.28).

$Z = I_M(X, Y)(X + Y) + I_{M^c}(X, Y)\,2Y$, $M = \{(t, u) : \max(t, u) \le 1\}$ (11.153)

Exercise 11.40 (Solution on p. 343.)

$f_{XY}(t, u) = \frac{12}{179}(3t^2 + u)$, for $0 \le t \le 2$, $0 \le u \le \min\{2, 3 - t\}$ (see Exercise 11.29).

$Z = I_M(X, Y)(X + Y) + I_{M^c}(X, Y)\,2Y^2$, $M = \{(t, u) : t \le 1, u \ge 1\}$ (11.154)

Exercise 11.41 (Solution on p. 343.)

$f_{XY}(t, u) = \frac{12}{227}(3t + 2tu)$, for $0 \le t \le 2$, $0 \le u \le \min\{1 + t, 2\}$ (see Exercise 11.30).

$Z = I_M(X, Y)\,X + I_{M^c}(X, Y)\,XY$, $M = \{(t, u) : u \le \min(1, 2 - t)\}$ (11.155)

Exercise 11.42 (Solution on p. 344.) 

The class {X, Y, Z} is independent. (See Exercise 16 (Exercise 10.16) from "Problems on Functions
of Random Variables", m-file npr10_16.m (Section 17.8.42: npr10_16))

$X = -2I_A + I_B + 3I_C$. Minterm probabilities are (in the usual order)

0.255 0.025 0.375 0.045 0.108 0.012 0.162 0.018 (11.156)

$Y = I_D + 3I_E + I_F - 3$. The class {D, E, F} is independent with

P(D) = 0.32   P(E) = 0.56   P(F) = 0.40 (11.157)

Z has distribution

Value         -1.3    1.2    2.7    3.4    5.8
Probability   0.12   0.24   0.43   0.13   0.08

Table 11.5

$W = X^2 + 3XY^2 - 3Z$. Determine $E[W]$ and $E[W^2]$.

Solutions to Exercises in Chapter 11

Solution to Exercise 11.1 (p. 326) 

% file npr07_01.m (Section 17.8.30: npr07_01)
% Data for Exercise 1 (Exercise 7.1) from "Problems on Distribution and Density Functions"
T = [1 3 2 3 4 2 1 3 5 2];
pc = 0.01*[8 13 6 9 14 11 12 7 11 9];
disp('Data are in T and pc')
npr07_01
Data are in T and pc
EX = T*pc'
EX = 2.7000
[X,PX] = csort(T,pc);        % Alternate using X, PX
ex = X*PX'
ex = 2.7000
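
The two calculations in the solution to Exercise 11.1 differ only in whether repeated values are first consolidated into a distribution. The Python sketch below mimics that consolidation with a dictionary; it is an illustration of the idea behind csort under that assumption, not a port of the m-function.

```python
from collections import defaultdict

T = [1, 3, 2, 3, 4, 2, 1, 3, 5, 2]                    # values on C_1 ... C_10
pc = [0.08, 0.13, 0.06, 0.09, 0.14, 0.11, 0.12, 0.07, 0.11, 0.09]

# Direct calculation from the values on the partition
ex_direct = sum(t * p for t, p in zip(T, pc))

# Consolidate equal values and add their probabilities
dist = defaultdict(float)
for t, p in zip(T, pc):
    dist[t] += p
X = sorted(dist)                                      # distinct values, ascending
PX = [dist[x] for x in X]
ex_dist = sum(x * p for x, p in zip(X, PX))

print(X, [round(p, 2) for p in PX])
print(round(ex_direct, 4), round(ex_dist, 4))
```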

Solution to Exercise 11.2 (p. 326) 

% file npr07_02.m (Section 17.8.31: npr07_02)
% Data for Exercise 2 (Exercise 7.2) from "Problems on Distribution and Density Functions"
T = [3.5 5.0 3.5 7.5 5.0 5.0 3.5 7.5];
pc = 0.01*[10 15 15 20 10 5 10 15];
disp('Data are in T, pc')
npr07_02
Data are in T, pc
EX = T*pc'
EX = 5.3500
[X,PX] = csort(T,pc);
ex = X*PX'
ex = 5.3500

Solution to Exercise 11.3 (p. 326) 

% file npr06_12.m (Section 17.8.28: npr06_12)
% Data for Exercise 12 (Exercise 6.12) from "Problems on Random Variables and Probabilities"
pm = 0.001*[5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302];
c = [1 1 1 1 0];
disp('Minterm probabilities in pm, coefficients in c')
npr06_12
Minterm probabilities in pm, coefficients in c
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
EX = X*PX'
EX = 2.9890
T = sum(mintable(4));
[x,px] = csort(T,pm);
ex = x*px'
ex = 2.9890

Solution to Exercise 11.4 (p. 326)

X ~ binomial (127, 0.0083). $E[X] = 127 \cdot 0.0083 = 1.0541$

Solution to Exercise 11.5 (p. 326)

X ~ binomial (20, 1/2). $E[X] = 20 \cdot 0.5 = 10$.

Solution to Exercise 11.6 (p. 327)

N ~ binomial (50, 0.2). $E[N] = 50 \cdot 0.2 = 10$. $E[X] = 500 - 30E[N] = 200$.

Solution to Exercise 11.7 (p. 327)

X ~ Poisson (7). $E[X] = 7$.

Solution to Exercise 11.8 (p. 327)

X ~ gamma (20, 0.0002). $E[X] = 20/0.0002 = 100,000$.
Solution to Exercise 11.9 (p. 327) 

$E[X] = \int t f_X(t)\,dt = \frac{6}{5}\int_0^1 t^3\,dt + \frac{6}{5}\int_1^2 (2t - t^2)\,dt = \frac{11}{10}$ (11.158)

Solution to Exercise 11.10 (p. 327)

$E[Y] = \int g(t) f_X(t)\,dt = \int_0^a t \lambda e^{-\lambda t}\,dt + aP(X > a) =$ (11.159)

$\frac{\lambda}{\lambda^2}\left[1 - e^{-\lambda a}(1 + \lambda a)\right] + a e^{-\lambda a} = \frac{1}{\lambda}\left(1 - e^{-\lambda a}\right)$ (11.160)

tappr
Enter matrix [a b] of x-range endpoints  [0 1000]
Enter number of x approximation points  10000
Enter density as a function of t  (1/50)*exp(-t/50)
Use row matrices X and PX as in the simple case
G = X.*(X<=30) + 30*(X>30);
EZ = G*PX'
EZ = 22.5594
ez = 50*(1 - exp(-30/50))     % Theoretical value
ez = 22.5594

Solution to Exercise 11.11 (p. 327) 

npr08_01 (Section 17.8.32: npr08_01)
Data in Pn, P, X, Y
jcalc
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
Use array operations on matrices X, Y, PX, PY, t, u, and P
EX = X*PX'
EX = 0.1538
ex = total(t.*P)                 % Alternate
ex = 0.1538
EY = Y*PY'
EY = 0.5000
EX2 = (X.^2)*PX'
EX2 = 0.1629
EY2 = (Y.^2)*PY'
EY2 = 0.6176
EXY = total(t.*u.*P)
EXY = 0.0769

Solution to Exercise 11.12 (p. 327) 



npr08_02 (Section 17.8.33: npr08_02)
Data are in X, Y, Pn, P
jcalc

EX = X*PX'
EX = 0.5000
EY = Y*PY'
EY = 0.7500
EX2 = (X.^2)*PX'
EX2 = 0.5714
EY2 = (Y.^2)*PY'
EY2 = 0.9643
EXY = total(t.*u.*P)
EXY = 0.2143

Solution to Exercise 11.13 (p. 328) 



npr08_03 (Section 17.8.34: npr08_03)
Answers are in X, Y, P, PY
jcalc

EX = X*PX'
EX = 3.5000
EY = Y*PY'
EY = 1.7500
EX2 = (X.^2)*PX'
EX2 = 15.1667
EY2 = (Y.^2)*PY'
EY2 = 4.6667
EXY = total(t.*u.*P)
EXY = 7.5833

Solution to Exercise 11.14 (p. 328) 



npr08_04 (Exercise 8.4)
Answers are in X, Y, P
jcalc

EX = X*PX'
EX = 7
EY = Y*PY'
EY = 3.5000
EX2 = (X.^2)*PX'
EX2 = 54.8333
EY2 = (Y.^2)*PY'
EY2 = 15.4583

Solution to Exercise 11.15 (p. 328) 



npr08_05 (Section 17.8.36: npr08_05)
Answers are in X, Y, P, PY
jcalc

EX = X*PX'
EX = 7.0000
EY = Y*PY'
EY = 1.1667

Solution to Exercise 11.16 (p. 328) 

npr08_06 (Section 17.8.37: npr08_06)
Data are in X, Y, P
jcalc

EX = X*PX'
EX = 1.3696
EY = Y*PY'
EY = 3.0344
EX2 = (X.^2)*PX'
EX2 = 9.7644
EY2 = (Y.^2)*PY'
EY2 = 11.4839
EXY = total(t.*u.*P)
EXY = 4.1423

Solution to Exercise 11.17 (p. 328) 



npr08_07 (Section 17.8.38: npr08_07)
Data are in X, Y, P
jcalc

EX = X*PX'
EX = 0.8590
EY = Y*PY'
EY = 1.1455
EX2 = (X.^2)*PX'
EX2 = 5.8495
EY2 = (Y.^2)*PY'
EY2 = 19.6115
EXY = total(t.*u.*P)
EXY = 3.6803

Solution to Exercise 11.18 (p. 329) 
npr08_08 (Section 17.8.39: npr08_08)
Data are in X, Y, P
jcalc

EX = X*PX'
EX = 10.1000
EY = Y*PY'
EY = 3.0016
EX2 = (X.^2)*PX'
EX2 = 133.0800
EY2 = (Y.^2)*PY'
EY2 = 41.5564
EXY = total(t.*u.*P)
EXY = 22.2890

Solution to Exercise 11.19 (p. 329) 



npr08_09 (Section 17.8.40: npr08_09)
Data are in X, Y, P
jcalc

EX = X*PX'
EX = 1.9250
EY = Y*PY'
EY = 2.8050
EX2 = (X.^2)*PX'
EX2 = 4.0375
EY2 = (Y.^2)*PY'
EY2 = 8.9850
EXY = total(t.*u.*P)
EXY = 5.1410

Solution to Exercise 11.20 (p. 330) 



$E[X] = \int_0^1 2t(1 - t)\,dt = 1/3$, $E[Y] = 2/3$, $E[X^2] = 1/6$, $E[Y^2] = 2/3$ (11.161)

$E[XY] = \int_0^1\int_0^{2(1-t)} tu\,du\,dt = 1/6$ (11.162)

tuappr: [0 1] [0 2] 200 400 u<=2*(1-t)
EX = 0.3333 EY = 0.6667 EX2 = 0.1667 EY2 = 0.6667
EXY = 0.1667 (use t, u, P)

Solution to Exercise 11.21 (p. 330) 

$E[X] = E[Y] = \int_0^1 t^2\,dt + \int_1^2 (2t - t^2)\,dt = 1$, $E[X^2] = E[Y^2] = 7/6$ (11.163)

$E[XY] = (1/2)\int_0^1\int_{1-t}^{1+t} tu\,du\,dt + (1/2)\int_1^2\int_{t-1}^{3-t} tu\,du\,dt = 1$ (11.164)

tuappr: [0 2] [0 2] 200 200 0.5*(u<=min(t+1,3-t))&(u>=max(1-t,t-1))
EX = 1.0000 EY = 1.0002 EX2 = 1.1684 EY2 = 1.1687 EXY = 1.0002
Solution to Exercise 11.22 (p. 330)

$E[X] = 2/3$, $E[Y] = 1/3$, $E[X^2] = 1/2$, $E[Y^2] = 1/6$, $E[XY] = 2/9$ (11.165)

tuappr: [0 1] [0 1] 200 200 4*t.*(1-u)
EX = 0.6667 EY = 0.3333 EX2 = 0.5000 EY2 = 0.1667 EXY = 0.2222

Solution to Exercise 11.23 (p. 330) 

$E[X] = E[Y] = \frac{1}{4}\int_0^2 (t^2 + t)\,dt = \frac{7}{6}$, $E[X^2] = E[Y^2] = 5/3$ (11.166)

$E[XY] = \frac{1}{8}\int_0^2\int_0^2 (t^2 u + tu^2)\,du\,dt = \frac{4}{3}$ (11.167)

tuappr: [0 2] [0 2] 200 200 (1/8)*(t+u)
EX = 1.1667 EY = 1.1667 EX2 = 1.6667 EY2 = 1.6667 EXY = 1.3333

Solution to Exercise 11.24 (p. 330) 

$E[X] = \int_0^{\infty} 2te^{-2t}\,dt = \frac{1}{2}$, $E[Y] = \frac{2}{3}$, $E[X^2] = \frac{1}{2}$, $E[Y^2] = \frac{1}{2}$, $E[XY] = \frac{1}{3}$ (11.168)

tuappr: [0 6] [0 1] 600 200 4*u.*exp(-2*t)
EX = 0.5000 EY = 0.6667 EX2 = 0.4998 EY2 = 0.5000 EXY = 0.3333

Solution to Exercise 11.25 (p. 330) 

$E[X] = \frac{313}{220}$, $E[Y] = \frac{1429}{880}$, $E[X^2] = \frac{49}{22}$, $E[Y^2] = \frac{172}{55}$, $E[XY] = \frac{2153}{880}$ (11.169)

tuappr: [0 2] [0 3] 200 300 (3/88)*(2*t + 3*u.^2).*(u<1+t)
EX = 1.4229 EY = 1.6202 EX2 = 2.2277 EY2 = 3.1141 EXY = 2.4415

Solution to Exercise 11.26 (p. 330) 

$E[X] = \frac{2}{5}$, $E[Y] = \frac{11}{15}$, $E[X^2] = \frac{2}{5}$, $E[Y^2] = \frac{3}{5}$, $E[XY] = \frac{2}{5}$ (11.170)

tuappr: [-1 1] [0 1] 400 200 12*t.^2.*u.*(u>=max(0,t)).*(u<=min(1+t,1))
EX = 0.4035 EY = 0.7342 EX2 = 0.4016 EY2 = 0.6009 EXY = 0.4021



Solution to Exercise 11.27 (p. 331) 



$E[X] = \frac{52}{55}$, $E[Y] = \frac{32}{55}$, $E[X^2] = \frac{57}{55}$, $E[Y^2] = \frac{2}{5}$, $E[XY] = \frac{28}{55}$ (11.171)

tuappr: [0 2] [0 1] 400 200 (24/11)*t.*u.*(u<=min(1,2-t))
EX = 0.9458 EY = 0.5822 EX2 = 1.0368 EY2 = 0.4004 EXY = 0.5098
Solution to Exercise 11.28 (p. 331)

$E[X] = \frac{53}{46}$, $E[Y] = \frac{22}{23}$, $E[X^2] = \frac{397}{230}$, $E[Y^2] = \frac{261}{230}$, $E[XY] = \frac{251}{230}$ (11.172)

tuappr: [0 2] [0 2] 200 200 (3/23)*(t + 2*u).*(u<=max(2-t,t))
EX = 1.1518 EY = 0.9596 EX2 = 1.7251 EY2 = 1.1417 EXY = 1.0944

Solution to Exercise 11.29 (p. 331) 

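$E[X] = \frac{2313}{1790}$, $E[Y] = \frac{778}{895}$, $E[X^2] = \frac{1711}{895}$, $E[Y^2] = \frac{916}{895}$, $E[XY] = \frac{1811}{1790}$ (11.173)

tuappr: [0 2] [0 2] 400 400 (12/179)*(3*t.^2 + u).*(u<=min(2,3-t))
EX = 1.2923 EY = 0.8695 EX2 = 1.9119 EY2 = 1.0239 EXY = 1.0122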

Solution to Exercise 11.30 (p. 331) 

$E[X] = \frac{1567}{1135}$, $E[Y] = \frac{2491}{2270}$, $E[X^2] = \frac{476}{227}$, $E[Y^2] = \frac{1716}{1135}$, $E[XY] = \frac{5261}{3405}$ (11.174)

tuappr: [0 2] [0 2] 400 400 (12/227)*(3*t + 2*t.*u).*(u<=min(1+t,2))
EX = 1.3805 EY = 1.0974 EX2 = 2.0967 EY2 = 1.5120 EXY = 1.5450

Solution to Exercise 11.31 (p. 331) 

$E[X] = \frac{16}{13}$, $E[Y] = \frac{11}{12}$, $E[X^2] = \frac{219}{130}$, $E[Y^2] = \frac{83}{78}$, $E[XY] = \frac{431}{390}$ (11.175)

tuappr: [0 2] [0 2] 400 400 (2/13)*(t + 2*u).*(u<=min(2*t,3-t))
EX = 1.2309 EY = 0.9169 EX2 = 1.6849 EY2 = 1.0647 EXY = 1.1056

Solution to Exercise 11.32 (p. 331) 

$E[X] = \frac{243}{224}$, $E[Y] = \frac{11}{16}$, $E[X^2] = \frac{107}{70}$, $E[Y^2] = \frac{127}{240}$, $E[XY] = \frac{347}{448}$ (11.176)

tuappr: [0 2] [0 1] 400 200 (3/8)*(t.^2+2*u).*(t<=1) + (9/14)*(t.^2.*u.^2).*(t > 1)
EX = 1.0848 EY = 0.6875 EX2 = 1.5286 EY2 = 0.5292 EXY = 0.7745

Solution to Exercise 11.33 (p. 332) 

Use x and px to prevent renaming. 

x = [-5 -1 3 4 7];
px = 0.01* [15 20 30 25 10] ; 
icalc 

Enter row matrix of X-values x 
Enter row matrix of Y-values x 
Enter X probabilities px 
Enter Y probabilities px 

Use array operations on matrices X, Y, PX, PY, t, u, and P 

G = 3*t -4*u; 
[R,PR] = csort(G,P);

icalc
Enter row matrix of X-values  R
Enter row matrix of Y-values  x
Enter X probabilities  PR
Enter Y probabilities  px

Use array operations on matrices X, Y, PX, PY, t, u, and P
H = t + 2*u;
EH = total(H.*P)
EH = 1.6500
[W,PW] = csort(H,P);             % Alternate
EW = W*PW'
EW = 1.6500
icalc3                           % Solution with icalc3
Enter row matrix of X-values  x
Enter row matrix of Y-values  x
Enter row matrix of Z-values  x
Enter X probabilities  px
Enter Y probabilities  px
Enter Z probabilities  px

Use array operations on matrices X, Y, Z,
PX, PY, PZ, t, u, v, and P
K = 3*t - 4*u + 2*v;
EK = total(K.*P)
EK = 1.6500

Solution to Exercise 11.34 (p. 332) 



X = 0:150; 
PX = ipoisson(75,X) ; 

G = 200 + 18*(X - 10).*(X>=10) + (16 - 18)*(X - 20).*(X>=20) + ...
    (15 - 16)*(X - 30).*(X>=30) + (13 - 15)*(X - 50).*(X>=50);
[Z,PZ] = csort(G,PX);
EZ = Z*PZ'
EZ = 1.1650e+03
EZ2 = (Z.^2)*PZ'
EZ2 = 1.3699e+06 

Solution to Exercise 11.35 (p. 332) 

npr08_07 (Section 17.8.38: npr08_07)
Data are in X, Y, P
jcalc

G = 3*t.^2 + 2*t.*u - u.^2;
EG = total(G.*P)
EG = 5.2975
EG2 = total(G.^2.*P)
EG2 = 1.0868e+03
[Z,PZ] = csort(G,P);             % Alternate
EZ = Z*PZ'
EZ = 5.2975
EZ2 = (Z.^2)*PZ'
EZ2 = 1.0868e+03

Solution to Exercise 11.36 (p. 332) 



H = t.*(t+u<=4) + 2*u.*(t+u>4);
EH = total(H.*P)
EH = 4.7379
EH2 = total(H.^2.*P)
EH2 = 61.4351

[W,PW] = csort(H,P);             % Alternate
EW = W*PW'
EW = 4.7379
EW2 = (W.^2)*PW'
EW2 = 61.4351

Solution to Exercise 11.37 (p. 333) 

$E[Z] = \frac{3}{88}\int_0^1\int_0^{1+t} 4t\,(2t + 3u^2)\,du\,dt + \frac{3}{88}\int_1^2\int_0^{1+t} (t + u)(2t + 3u^2)\,du\,dt = \frac{5649}{1760}$ (11.177)

$E[Z^2] = \frac{3}{88}\int_0^1\int_0^{1+t} (4t)^2 (2t + 3u^2)\,du\,dt + \frac{3}{88}\int_1^2\int_0^{1+t} (t + u)^2 (2t + 3u^2)\,du\,dt = \frac{4881}{440}$ (11.178)

tuappr: [0 2] [0 3] 200 300 (3/88)*(2*t+3*u.^2).*(u<=1+t)
G = 4*t.*(t<=1) + (t + u).*(t>1);
EG = total(G.*P)
EG = 3.2086
EG2 = total(G.^2.*P)
EG2 = 11.0872

Solution to Exercise 11.38 (p. 333) 

$E[Z] = \frac{12}{11}\int_0^1\int_t^1 t^2 u\,du\,dt + \frac{24}{11}\int_0^1\int_0^t tu^3\,du\,dt + \frac{24}{11}\int_1^2\int_0^{2-t} tu^3\,du\,dt = \frac{16}{55}$ (11.179)

$E[Z^2] = \frac{6}{11}\int_0^1\int_t^1 t^3 u\,du\,dt + \frac{24}{11}\int_0^1\int_0^t tu^5\,du\,dt + \frac{24}{11}\int_1^2\int_0^{2-t} tu^5\,du\,dt = \frac{39}{308}$ (11.180)

tuappr: [0 2] [0 1] 400 200 (24/11)*t.*u.*(u<=min(1,2-t))
G = (1/2)*t.*(u>t) + u.^2.*(u<=t);
EZ = 0.2920 EZ2 = 0.1278

Solution to Exercise 11.39 (p. 333) 



$E[Z] = \frac{3}{23}\int_0^1\int_0^1 (t + u)(t + 2u)\,du\,dt + \frac{3}{23}\int_0^1\int_1^{2-t} 2u(t + 2u)\,du\,dt + \frac{3}{23}\int_1^2\int_0^t 2u(t + 2u)\,du\,dt = \frac{175}{92}$ (11.181)

$E[Z^2] = \frac{3}{23}\int_0^1\int_0^1 (t + u)^2 (t + 2u)\,du\,dt + \frac{3}{23}\int_0^1\int_1^{2-t} 4u^2 (t + 2u)\,du\,dt + \frac{3}{23}\int_1^2\int_0^t 4u^2 (t + 2u)\,du\,dt = \frac{2063}{460}$ (11.182)

tuappr: [0 2] [0 2] 400 400 (3/23)*(t+2*u).*(u<=max(2-t,t))
M = max(t,u)<=1;
G = (t+u).*M + 2*u.*(1-M);
EZ = total(G.*P)
EZ = 1.9048
EZ2 = total(G.^2.*P)
EZ2 = 4.4963

Solution to Exercise 11.40 (p. 333) 

$E[Z] = \frac{12}{179}\int_0^1\int_1^2 (t + u)(3t^2 + u)\,du\,dt + \frac{12}{179}\int_0^1\int_0^1 2u^2 (3t^2 + u)\,du\,dt$ (11.183)

$+ \frac{12}{179}\int_1^2\int_0^{3-t} 2u^2 (3t^2 + u)\,du\,dt = \frac{1422}{895}$ (11.184)

$E[Z^2] = \frac{12}{179}\int_0^1\int_1^2 (t + u)^2 (3t^2 + u)\,du\,dt + \frac{12}{179}\int_0^1\int_0^1 4u^4 (3t^2 + u)\,du\,dt$ (11.185)

$+ \frac{12}{179}\int_1^2\int_0^{3-t} 4u^4 (3t^2 + u)\,du\,dt = \frac{28296}{6265}$ (11.186)

tuappr: [0 2] [0 2] 400 400 (12/179)*(3*t.^2 + u).*(u <= min(2,3-t))
M = (t<=1)&(u>=1);
G = (t + u).*M + 2*u.^2.*(1 - M);
EZ = total(G.*P)
EZ = 1.5898
EZ2 = total(G.^2.*P)
EZ2 = 4.5224

Solution to Exercise 11.41 (p. 333) 

$E[Z] = \frac{12}{227}\int_0^1\int_0^1 t\,(3t + 2tu)\,du\,dt + \frac{12}{227}\int_1^2\int_0^{2-t} t\,(3t + 2tu)\,du\,dt$ (11.187)

$+ \frac{12}{227}\int_0^1\int_1^{1+t} tu\,(3t + 2tu)\,du\,dt + \frac{12}{227}\int_1^2\int_{2-t}^{2} tu\,(3t + 2tu)\,du\,dt = \frac{5774}{3405}$ (11.188)

$E[Z^2] = \frac{56673}{15890}$ (11.189)

tuappr: [0 2] [0 2] 400 400 (12/227)*(3*t + 2*t.*u).*(u <= min(1+t,2))
M = u <= min(1,2-t);
G = t.*M + t.*u.*(1 - M);
EZ = total(G.*P)
EZ = 1.6955
EZ2 = total(G.^2.*P)
EZ2 = 3.5659
Solution to Exercise 11.42 (p. 333)

npr10_16 (Section 17.8.42: npr10_16)
Data are in cx, pmx, cy, pmy, Z, PZ
[X,PX] = canonicf(cx,pmx);
[Y,PY] = canonicf(cy,pmy);
icalc3
input: X, Y, Z, PX, PY, PZ

Use array operations on matrices X, Y, Z,
PX, PY, PZ, t, u, v, and P
G = t.^2 + 3*t.*u.^2 - 3*v;
[W,PW] = csort(G,P);
EW = W*PW'
EW = -1.8673
EW2 = (W.^2)*PW'
EW2 = 426.8529



Chapter 12 

Variance, Covariance, Linear Regression 



12.1 Variance 1 

In the treatment of the mathematical expectation of a real random variable X, we note that the mean value
locates the center of the probability mass distribution induced by X on the real line. In this unit, we examine
how expectation may be used for further characterization of the distribution for X. In particular, we deal
with the concept of variance and its square root, the standard deviation. In subsequent units, we show
how expectation may be used to characterize the distribution for a pair {X, Y} considered jointly, with the
concepts of covariance and linear regression.

12.1.1 Variance 

Location of the center of mass for a distribution is important, but provides limited information. Two markedly 
different random variables may have the same mean value. It would be helpful to have a measure of the 
spread of the probability mass about the mean. Among the possibilities, the variance and its square root, 
the standard deviation, have been found particularly useful. 

Definition. The variance of a random variable X is the mean square of its variation about the mean 
value: 

$\mathrm{Var}[X] = \sigma_X^2 = E[(X - \mu_X)^2]$, where $\mu_X = E[X]$ (12.1)

The standard deviation for X is the positive square root $\sigma_X$ of the variance.
Remarks

• If $X(\omega)$ is the observed value of X, its variation from the mean is $X(\omega) - \mu_X$. The variance is the
probability weighted average of the square of these variations.

• The square of the error treats positive and negative variations alike, and it weights large variations 
more heavily than smaller ones. 

• As in the case of mean value, the variance is a property of the distribution, rather than of the random 
variable. 

• We show below that the standard deviation is a "natural" measure of the variation from the mean. 

• In the treatment of mathematical expectation, we show that

$E[(X - c)^2]$ is a minimum iff $c = E[X]$, in which case $E[(X - E[X])^2] = E[X^2] - E^2[X]$ (12.2)

This shows that the mean value is the constant which best approximates the random variable, in the 
mean square sense. 
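
The minimizing property quoted in the last remark can be checked in a few lines. The following worked step is a sketch that uses only linearity of expectation and the fact that $E[X - \mu_X] = 0$:

```latex
\begin{align*}
E[(X - c)^2] &= E\big[((X - \mu_X) + (\mu_X - c))^2\big] \\
             &= E[(X - \mu_X)^2] + 2(\mu_X - c)\,E[X - \mu_X] + (\mu_X - c)^2 \\
             &= \mathrm{Var}[X] + (\mu_X - c)^2 .
\end{align*}
```

The last expression is smallest exactly when $c = \mu_X$, and the minimum value is Var [X].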



1 This content is available online at <http://cnx.org/content/m23441/1.6/>.
Basic patterns for variance

Since variance is the expectation of a function of the random variable X, we utilize properties of expectation
in computations. In addition, we find it expedient to identify several patterns for variance which are
frequently useful in performing calculations. For one thing, while the variance is defined as $E[(X - \mu_X)^2]$,
this is usually not the most convenient form for computation. The result quoted above gives an alternate
expression.

(V1): Calculating formula. $\mathrm{Var}[X] = E[X^2] - E^2[X]$.

(V2): Shift property. $\mathrm{Var}[X + b] = \mathrm{Var}[X]$. Adding a constant b to X shifts the distribution (hence its
center of mass) by that amount. The variation of the shifted distribution about the shifted center of
mass is the same as the variation of the original, unshifted distribution about the original center of
mass.

(V3): Change of scale. $\mathrm{Var}[aX] = a^2\,\mathrm{Var}[X]$. Multiplication of X by constant a changes the scale by a
factor |a|. The squares of the variations are multiplied by $a^2$. So also is the mean of the squares of the
variations.

(V4): Linear combinations

a. $\mathrm{Var}[aX \pm bY] = a^2\,\mathrm{Var}[X] + b^2\,\mathrm{Var}[Y] \pm 2ab\,(E[XY] - E[X]E[Y])$

b. More generally,

$\mathrm{Var}\left[\sum_{k=1}^{n} a_k X_k\right] = \sum_{k=1}^{n} a_k^2\,\mathrm{Var}[X_k] + 2\sum_{i<j} a_i a_j\,(E[X_i X_j] - E[X_i]E[X_j])$ (12.3)

The term $c_{ij} = E[X_i X_j] - E[X_i]E[X_j]$ is the covariance of the pair $\{X_i, X_j\}$, whose role we study
in the unit on that topic. If the $c_{ij}$ are all zero, we say the class is uncorrelated.



Remarks

• If the pair {X, Y} is independent, it is uncorrelated. The converse is not true, as examples in the next
section show.

• If the $a_i = \pm 1$ and all pairs are uncorrelated, then

$\mathrm{Var}\left[\sum_{i=1}^{n} a_i X_i\right] = \sum_{i=1}^{n} \mathrm{Var}[X_i]$ (12.4)

The variances add even if the coefficients are negative.
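
As a numerical illustration of the remark that variances add even with a negative coefficient, the following sketch compares Var [X - Y] with Var [X] + Var [Y] for an independent pair. The values and probabilities are hypothetical, chosen only for illustration, and only basic MATLAB functions are used (none of the m-programs of the text).

```matlab
% Two independent simple random variables (hypothetical values and probabilities).
X  = [0 1 3];      PX = [0.2 0.5 0.3];
Y  = [-1 2];       PY = [0.4 0.6];

VX = (X.^2)*PX' - (X*PX')^2;          % Var[X] by the calculating formula (V1)
VY = (Y.^2)*PY' - (Y*PY')^2;          % Var[Y] by the calculating formula (V1)

% Joint distribution under independence: P(i,j) = PY(i)*PX(j)
[t,u] = meshgrid(X,Y);                % t holds X-values, u holds Y-values on the grid
P  = PY'*PX;

D  = t - u;                           % values of X - Y at each grid point
ED = sum(sum(D.*P));
VD = sum(sum(D.^2.*P)) - ED^2;        % Var[X - Y] computed directly

disp([VD, VX + VY])                   % the two entries should agree
```

Independence is used here only to build the joint matrix P; the identity itself needs only that the pair be uncorrelated.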

We calculate variances for some common distributions. Some details are omitted, usually details of algebraic
manipulation or the straightforward evaluation of integrals. In some cases we use well known sums of
infinite series or values of definite integrals. A number of pertinent facts are summarized in Appendix
B (Section 17.2), Some Mathematical Aids. The results below are included in the table in Appendix C
(Section 17.3).

Variances of some discrete distributions 

1. Indicator function $X = I_E$, $P(E) = p$, $q = 1 - p$, $E[X] = p$

$E[X^2] - E^2[X] = E[I_E^2] - p^2 = E[I_E] - p^2 = p - p^2 = p(1 - p) = pq$ (12.5)

2. Simple random variable $X = \sum_{i=1}^{n} t_i I_{A_i}$ (primitive form), $P(A_i) = p_i$.

$\mathrm{Var}[X] = \sum_{i=1}^{n} t_i^2 p_i q_i - 2\sum_{i<j} t_i t_j p_i p_j$, since $E[I_{A_i} I_{A_j}] = 0$ for $i \neq j$ (12.6)

3. Binomial (n, p). $X = \sum_{i=1}^{n} I_{E_i}$ with $\{I_{E_i} : 1 \le i \le n\}$ iid, $P(E_i) = p$

$\mathrm{Var}[X] = \sum_{i=1}^{n} \mathrm{Var}[I_{E_i}] = \sum_{i=1}^{n} pq = npq$ (12.7)

4. Geometric (p). $P(X = k) = pq^k$ for all $k \ge 0$, $E[X] = q/p$
We use a trick: $E[X^2] = E[X(X - 1)] + E[X]$

$E[X^2] = p\sum_{k=0}^{\infty} k(k-1)q^k + q/p = pq^2\sum_{k=2}^{\infty} k(k-1)q^{k-2} + q/p = pq^2\frac{2}{(1-q)^3} + q/p = 2\frac{q^2}{p^2} + q/p$ (12.8)

$\mathrm{Var}[X] = 2\frac{q^2}{p^2} + q/p - (q/p)^2 = q/p^2$ (12.9)

5. Poisson ($\mu$). $P(X = k) = e^{-\mu}\frac{\mu^k}{k!}$ for all $k \ge 0$

Using $E[X^2] = E[X(X - 1)] + E[X]$, we have

$E[X^2] = e^{-\mu}\sum_{k=2}^{\infty} k(k-1)\frac{\mu^k}{k!} + \mu = e^{-\mu}\mu^2\sum_{k=2}^{\infty}\frac{\mu^{k-2}}{(k-2)!} + \mu = \mu^2 + \mu$ (12.10)

Thus, $\mathrm{Var}[X] = \mu^2 + \mu - \mu^2 = \mu$. Note that both the mean and the variance have common value $\mu$.
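
The closed forms for the geometric and Poisson variances can be checked by summing the series numerically. The sketch below uses basic MATLAB only; the parameter values p = 0.3 and mu = 7 and the truncation point 400 are assumptions made for illustration.

```matlab
k = 0:400;                               % truncation point is an assumption

% Geometric(p) on k = 0, 1, 2, ...: P(X = k) = p*q^k
p = 0.3;  q = 1 - p;
pg  = p*q.^k;
EXg = k*pg';
VXg = (k.^2)*pg' - EXg^2;
disp([VXg, q/p^2])                       % numerical sum vs. q/p^2

% Poisson(mu): P(X = k) = exp(-mu)*mu^k/k!, computed with logs to avoid overflow
mu  = 7;
pp  = exp(-mu + k*log(mu) - gammaln(k+1));
EXp = k*pp';
VXp = (k.^2)*pp' - EXp^2;
disp([EXp, VXp, mu])                     % mean and variance should both be near mu
```

The neglected tail mass is far below roundoff for these parameters, so the truncated sums stand in for the infinite series.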
Some absolutely continuous distributions 

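1. Uniform on (a, b). $f_X(t) = \frac{1}{b - a}$, $a < t < b$, $E[X] = \frac{a + b}{2}$

$E[X^2] = \frac{1}{b - a}\int_a^b t^2\,dt = \frac{b^3 - a^3}{3(b - a)}$, so $\mathrm{Var}[X] = \frac{b^3 - a^3}{3(b - a)} - \frac{(a + b)^2}{4} = \frac{(b - a)^2}{12}$ (12.11)

2. Symmetric triangular (a, b). Because of the shift property (V2), we may center the
distribution at the origin. Then the distribution is symmetric triangular (-c, c), where c = (b - a)/2.
Because of the symmetry

$\mathrm{Var}[X] = E[X^2] = \int_{-c}^{c} t^2 f_X(t)\,dt = 2\int_0^c t^2 f_X(t)\,dt$ (12.12)

Now, in this case,

$f_X(t) = \frac{c - t}{c^2}$, $0 \le t \le c$, so that $E[X^2] = \frac{2}{c^2}\int_0^c (ct^2 - t^3)\,dt = \frac{c^2}{6} = \frac{(b - a)^2}{24}$ (12.13)

3. Exponential ($\lambda$). $f_X(t) = \lambda e^{-\lambda t}$, $t \ge 0$, $E[X] = 1/\lambda$

$E[X^2] = \int_0^{\infty} \lambda t^2 e^{-\lambda t}\,dt = \frac{2}{\lambda^2}$, so that $\mathrm{Var}[X] = 1/\lambda^2$ (12.14)

4. Gamma ($\alpha$, $\lambda$). $f_X(t) = \frac{1}{\Gamma(\alpha)}\lambda^{\alpha} t^{\alpha - 1} e^{-\lambda t}$, $t \ge 0$, $E[X] = \frac{\alpha}{\lambda}$

$E[X^2] = \frac{1}{\Gamma(\alpha)}\int_0^{\infty} \lambda^{\alpha} t^{\alpha + 1} e^{-\lambda t}\,dt = \frac{\Gamma(\alpha + 2)}{\lambda^2\,\Gamma(\alpha)} = \frac{\alpha(\alpha + 1)}{\lambda^2}$ (12.15)

Hence $\mathrm{Var}[X] = \alpha/\lambda^2$

5. Normal ($\mu$, $\sigma^2$). $E[X] = \mu$

Consider $Y \sim N(0, 1)$, $E[Y] = 0$, $\mathrm{Var}[Y] = \frac{2}{\sqrt{2\pi}}\int_0^{\infty} t^2 e^{-t^2/2}\,dt = 1$.

$X = \sigma Y + \mu$ implies $\mathrm{Var}[X] = \sigma^2\,\mathrm{Var}[Y] = \sigma^2$ (12.16)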
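
The continuous results can be spot-checked by numerical integration. The sketch below uses the built-in MATLAB function integral (not one of the m-programs of the text); the parameter values a = 2, b = 5, lambda = 0.3, alpha = 4 are assumptions chosen for illustration.

```matlab
% Uniform on (a,b): expect (b - a)^2/12
a = 2;  b = 5;
fU = @(t) ones(size(t))/(b - a);
EU = integral(@(t) t.*fU(t), a, b);
VU = integral(@(t) t.^2.*fU(t), a, b) - EU^2;
disp([VU, (b - a)^2/12])

% Exponential(lambda): expect 1/lambda^2
lambda = 0.3;
fE = @(t) lambda*exp(-lambda*t);
EE = integral(@(t) t.*fE(t), 0, Inf);
VE = integral(@(t) t.^2.*fE(t), 0, Inf) - EE^2;
disp([VE, 1/lambda^2])

% Gamma(alpha,lambda): expect alpha/lambda^2
alpha = 4;
fG = @(t) lambda^alpha*t.^(alpha-1).*exp(-lambda*t)/gamma(alpha);
EGam = integral(@(t) t.*fG(t), 0, Inf);
VGam = integral(@(t) t.^2.*fG(t), 0, Inf) - EGam^2;
disp([VGam, alpha/lambda^2])
```

Each displayed pair should agree to within the default integration tolerance.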

Extensions of some previous examples 

In the unit on expectations, we calculate the mean for a variety of cases. We revisit some of those 
examples and calculate the variances. 

Example 12.1: Expected winnings (Example 8 (Example 11.8: Expected winnings) 
from "Mathematical Expectation: Simple Random Variables") 

A bettor places three bets at $2.00 each. The first pays $10.00 with probability 0.15, the second 
$8.00 with probability 0.20, and the third $20.00 with probability 0.10. 

SOLUTION 

The net gain may be expressed 

$X = 10I_A + 8I_B + 20I_C - 6$, with $P(A) = 0.15$, $P(B) = 0.20$, $P(C) = 0.10$ (12.17)

We may reasonably suppose the class {A, B, C} is independent (this assumption is not necessary
in computing the mean). Then

$\mathrm{Var}[X] = 10^2 P(A)[1 - P(A)] + 8^2 P(B)[1 - P(B)] + 20^2 P(C)[1 - P(C)]$ (12.18)

Calculation is straightforward. We may use MATLAB to perform the arithmetic.

c = [10 8 20];
p = 0.01*[15 20 10];
q = 1 - p;
VX = sum(c.^2.*p.*q)
VX = 58.9900 
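
As a cross-check on formula (12.18), the variance can also be obtained by listing all eight win/lose patterns for the three bets. This is a sketch in basic MATLAB under the same independence assumption; the variable names w, Pm, and pw are introduced here for illustration.

```matlab
c = [10 8 20];                       % payoffs
p = [0.15 0.20 0.10];                % P(A), P(B), P(C)

w  = dec2bin(0:7) - '0';             % 8-by-3 matrix: each row is one win/lose pattern
Pm = repmat(p, 8, 1);
pw = prod(w.*Pm + (1 - w).*(1 - Pm), 2);   % probability of each pattern (independence)

X  = w*c' - 6;                       % net gain for each pattern
EX = X'*pw;
VX = (X.^2)'*pw - EX^2;
disp([EX, VX])
```

The direct enumeration reproduces the value 58.99 obtained from the sum of the individual variances.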

Example 12.2: A function of X (Example 9 (Example 11.9: Expectation of a function 
of X) from "Mathematical Expectation: Simple Random Variables") 

Suppose X in a primitive form is

$X = -3I_{C_1} - I_{C_2} + 2I_{C_3} - 3I_{C_4} + 4I_{C_5} - I_{C_6} + I_{C_7} + 2I_{C_8} + 3I_{C_9} + 2I_{C_{10}}$ (12.19)

with probabilities $P(C_i)$ = 0.08, 0.11, 0.06, 0.13, 0.05, 0.08, 0.12, 0.07, 0.14, 0.16.
Let $g(t) = t^2 + 2t$. Determine $E[g(X)]$ and $\mathrm{Var}[g(X)]$.

c = [-3 -1 2 -3 4 -1 1 2 3 2];            % Original coefficients
pc = 0.01*[8 11 6 13 5 8 12 7 14 16];     % Probabilities for C_j
G = c.^2 + 2*c                            % g(c_j)
EG = G*pc'                                % Direct calculation E[g(X)]
EG = 6.4200
VG = (G.^2)*pc' - EG^2                    % Direct calculation Var[g(X)]
VG = 40.8036
[Z,PZ] = csort(G,pc);                     % Distribution for Z = g(X)
EZ = Z*PZ'                                % E[Z]
EZ = 6.4200
VZ = (Z.^2)*PZ' - EZ^2                    % Var[Z]

VZ = 40.8036 
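
For readers without the m-programs at hand, the consolidation step can be imitated with built-in functions. The sketch below assumes that the role of csort here is to sort the distinct values of g(X) and add the probabilities of equal values; it uses unique and accumarray as a stand-in and is not the author's implementation.

```matlab
c  = [-3 -1 2 -3 4 -1 1 2 3 2];
pc = 0.01*[8 11 6 13 5 8 12 7 14 16];
G  = c.^2 + 2*c;                          % g(c_j)

[Z,~,idx] = unique(G);                    % distinct values of g(X), sorted
PZ = accumarray(idx(:), pc(:))';          % add the probabilities of equal values

EZ = Z*PZ';
VZ = (Z.^2)*PZ' - EZ^2;
disp([Z; PZ])
disp([EZ, VZ])
```

The mean and variance computed from the consolidated distribution match the direct calculation above.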
Example 12.3: Z = g(X, Y) (Example 10 (Example 11.10: Expectation for Z = g (X, Y))
from "Mathematical Expectation: Simple Random Variables")

We use the same joint distribution as for Example 10 (Example 11.10: Expectation for Z =
g (X, Y)) from "Mathematical Expectation: Simple Random Variables" and let $g(t, u) = t^2 + 2tu - 3u$.
To set up for calculations, we use jcalc.

jdemo1                           % Call for data
jcalc                            % Set up
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
Use array operations on matrices X, Y, PX, PY, t, u, and P
G = t.^2 + 2*t.*u - 3*u;         % Calculation of matrix of [g(t_i, u_j)]
EG = total(G.*P)                 % Direct calculation of E[g(X,Y)]
EG = 3.2529
VG = total(G.^2.*P) - EG^2       % Direct calculation of Var[g(X,Y)]
VG = 80.2133
[Z,PZ] = csort(G,P);             % Determination of distribution for Z
EZ = Z*PZ'                       % E[Z] from distribution
EZ = 3.2529
VZ = (Z.^2)*PZ' - EZ^2           % Var[Z] from distribution
VZ = 80.2133

Example 12.4: A function with compound definition (Example 12 (Example 11.22: 
A function with a compound definition: truncated exponential) from "Mathematical 
Expectation; General Random Variables") 

Suppose X ~ exponential (0.3). Let 

$Z = \begin{cases} X^2 & \text{for } X \le 4 \\ 16 & \text{for } X > 4 \end{cases} = I_{[0,4]}(X)\,X^2 + I_{(4,\infty]}(X)\,16$ (12.20)

Determine E [Z] and Var [Z].
ANALYTIC SOLUTION

$E[g(X)] = \int g(t) f_X(t)\,dt = \int_0^{\infty} I_{[0,4]}(t)\,t^2\,0.3e^{-0.3t}\,dt + 16E[I_{(4,\infty]}(X)]$ (12.21)

$= \int_0^4 t^2\,0.3e^{-0.3t}\,dt + 16P(X > 4) \approx 7.4972$ (by Maple) (12.22)

$Z^2 = I_{[0,4]}(X)\,X^4 + I_{(4,\infty]}(X)\,256$ (12.23)

$E[Z^2] = \int_0^{\infty} I_{[0,4]}(t)\,t^4\,0.3e^{-0.3t}\,dt + 256E[I_{(4,\infty]}(X)] = \int_0^4 t^4\,0.3e^{-0.3t}\,dt + 256e^{-1.2} \approx 100.0562$ (12.24)

$\mathrm{Var}[Z] = E[Z^2] - E^2[Z] \approx 43.8486$ (by Maple) (12.25)

APPROXIMATION

To obtain a simple approximation, we must approximate by a bounded random variable. Since
$P(X > 50) = e^{-15} \approx 3 \cdot 10^{-7}$ we may safely truncate X at 50.
tappr
Enter matrix [a b] of x-range endpoints  [0 50]
Enter number of x approximation points  1000
Enter density as a function of t  0.3*exp(-0.3*t)
Use row matrices X and PX as in the simple case
M = X <= 4;
G = M.*X.^2 + 16*(1 - M);        % g(X)
EG = G*PX'                       % E[g(X)]
EG = 7.4972
VG = (G.^2)*PX' - EG^2           % Var[g(X)]
VG = 43.8472                     % Theoretical = 43.8486
[Z,PZ] = csort(G,PX);            % Distribution for Z = g(X)
EZ = Z*PZ'                       % E[Z] from distribution
EZ = 7.4972
VZ = (Z.^2)*PZ' - EZ^2           % Var[Z]

VZ = 43.8472 
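
The discrete approximation can be imitated in basic MATLAB. The sketch below is not the tappr procedure itself: it assumes a midpoint grid on [0, 50] with the density value times the subinterval length as the probability mass, renormalized after truncation, which may differ in detail from what tappr does.

```matlab
a = 0;  b = 50;  n = 1000;               % same range and number of points as above
dx = (b - a)/n;
X  = a + dx*((1:n) - 0.5);               % midpoints of the subintervals (an assumption)
PX = 0.3*exp(-0.3*X)*dx;                 % approximate probability mass near each point
PX = PX/sum(PX);                         % renormalize after truncating at b

M  = X <= 4;
G  = M.*X.^2 + 16*(1 - M);               % g(X)
EG = G*PX';
VG = (G.^2)*PX' - EG^2;
disp([EG, VG])                           % compare with 7.4972 and 43.8486
```

The results should be close to the analytic values; small differences reflect the particular discretization chosen.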

Example 12.5: Stocking for random demand (Example 13 (Example 11.23: Stocking 
for random demand (see Exercise 4 (Exercise 10.4) from "Problems on Functions of 
Random Variables")) from "Mathematical Expectation; General Random Variables") 

The manager of a department store is planning for the holiday season. A certain item costs c 
dollars per unit and sells for p dollars per unit. If the demand exceeds the amount m ordered, 
additional units can be special ordered for s dollars per unit (s > c). If demand is less than the 
amount ordered, the remaining stock can be returned (or otherwise disposed of) at r dollars per 
unit (r < c). Demand D for the season is assumed to be a random variable with Poisson ($\mu$)
distribution. Suppose $\mu$ = 50, c = 30, p = 50, s = 40, r = 20. What amount m should the manager
order to maximize the expected profit?

PROBLEM FORMULATION

Suppose D is the demand and X is the profit. Then

For $D \le m$, $X = D(p - c) - (m - D)(c - r) = D(p - r) + m(r - c)$
For $D > m$, $X = m(p - c) + (D - m)(p - s) = D(p - s) + m(s - c)$

It is convenient to write the expression for X in terms of $I_M$, where $M = (-\infty, m]$. Thus

$X = I_M(D)[D(p - r) + m(r - c)] + [1 - I_M(D)][D(p - s) + m(s - c)]$ (12.26)

$= D(p - s) + m(s - c) + I_M(D)[D(p - r) + m(r - c) - D(p - s) - m(s - c)]$ (12.27)

$= D(p - s) + m(s - c) + I_M(D)(s - r)[D - m]$ (12.28)

Then

$E[X] = (p - s)E[D] + m(s - c) + (s - r)E[I_M(D)D] - (s - r)mE[I_M(D)]$ (12.29)



We use the discrete approximation. 
APPROXIMATION 

mu = 50;
n = 100;
t = 0:n;
pD = ipoisson(mu,t);                % Approximate distribution for D
c = 30;
p = 50;
s = 40;
r = 20;
m = 45:55;
for i = 1:length(m)                 % Step by step calculation for various m
   M = t<=m(i);
   G(i,:) = (p-s)*t + m(i)*(s-c) + (s-r)*M.*(t - m(i));
end
EG = G*pD';
VG = (G.^2)*pD' - EG.^2;
SG = sqrt(VG);
disp([EG';VG';SG']')
  1.0e+04 *
    0.0931    1.1561    0.0108
    0.0936    1.3117    0.0115
    0.0939    1.4869    0.0122
    0.0942    1.6799    0.0130
    0.0943    1.8880    0.0137
    0.0944    2.1075    0.0145
    0.0943    2.3343    0.0153
    0.0941    2.5637    0.0160
    0.0938    2.7908    0.0167
    0.0934    3.0112    0.0174
    0.0929    3.2206    0.0179
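
To read off the order quantity that maximizes expected profit, the same calculation can be followed by a search over m. The sketch below uses basic MATLAB only; computing the Poisson probabilities through gammaln is a stand-in for ipoisson and is an assumption of this illustration.

```matlab
mu = 50;  t = 0:100;
pD = exp(-mu + t*log(mu) - gammaln(t+1));   % Poisson(mu) probabilities on 0:100
c = 30;  p = 50;  s = 40;  r = 20;
m = 45:55;

EG = zeros(size(m));
for i = 1:length(m)
  M = t <= m(i);                            % indicator I_M(D)
  G = (p-s)*t + m(i)*(s-c) + (s-r)*M.*(t - m(i));   % profit, as in (12.28)
  EG(i) = G*pD';
end
[EGmax, k] = max(EG);
fprintf('best m = %d, expected profit = %.2f\n', m(k), EGmax)
```

The maximizing value agrees with the row of the table above that shows the largest expected profit. Note from the second column that the variance of the profit grows steadily with m, so the choice involves a trade-off if variability matters.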



Example 12.6: A jointly distributed pair (Example 14 (Example 11.24: A jointly 
distributed pair) from "Mathematical Expectation; General Random Variables") 

Suppose the pair {X, Y} has joint density $f_{XY}(t, u) = 3u$ on the triangular region bounded by
u = 0, u = 1 + t, u = 1 - t. Let $Z = g(X, Y) = X^2 + 2XY$.

Determine E [Z] and Var [Z].

ANALYTIC SOLUTION

$E[Z] = \int\int (t^2 + 2tu) f_{XY}(t, u)\,du\,dt = 3\int_{-1}^{0}\int_0^{1+t} u(t^2 + 2tu)\,du\,dt + 3\int_0^1\int_0^{1-t} u(t^2 + 2tu)\,du\,dt = 1/10$ (12.30)

$E[Z^2] = 3\int_{-1}^{0}\int_0^{1+t} u(t^2 + 2tu)^2\,du\,dt + 3\int_0^1\int_0^{1-t} u(t^2 + 2tu)^2\,du\,dt = 3/35$ (12.31)

$\mathrm{Var}[Z] = E[Z^2] - E^2[Z] = 53/700 \approx 0.0757$ (12.32)

APPROXIMATION

tuappr
Enter matrix [a b] of X-range endpoints  [-1 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  400
Enter number of Y approximation points  200
Enter expression for joint density  3*u.*(u<=min(1+t,1-t))
Use array operations on X, Y, PX, PY, t, u, and P
G = t.^2 + 2*t.*u;               % g(X,Y) = X^2 + 2XY
EG = total(G.*P)                 % E[g(X,Y)]
EG = 0.1006                      % Theoretical value = 1/10
VG = total(G.^2.*P) - EG^2
VG = 0.0765                      % Theoretical value 53/700 = 0.0757
[Z,PZ] = csort(G,P);             % Distribution for Z
EZ = Z*PZ'                       % E[Z] from distribution
EZ = 0.1006
VZ = (Z.^2)*PZ' - EZ^2
VZ = 0.0765

Example 12.7: A function with compound definition (Example 15 (Example 11.25: 
A function with a compound definition) from "Mathematical Expectation; General 
Random Variables") 

The pair {X, Y} has joint density $f_{XY}(t, u) = 1/2$ on the square region bounded by u = 1 + t,
u = 1 - t, u = 3 - t, and u = t - 1.

$W = \begin{cases} X & \text{for } \max\{X, Y\} \le 1 \\ 2Y & \text{for } \max\{X, Y\} > 1 \end{cases} = I_Q(X, Y)\,X + I_{Q^c}(X, Y)\,2Y$ (12.33)

where $Q = \{(t, u) : \max\{t, u\} \le 1\} = \{(t, u) : t \le 1, u \le 1\}$.

Determine E [W] and Var [W].

ANALYTIC SOLUTION

The intersection of the region Q and the square is the set for which $0 \le t \le 1$ and $1 - t \le u \le 1$.
Reference to Figure 11.3.2 shows three regions of integration.

$E[W] = \frac{1}{2}\int_0^1\int_{1-t}^{1} t\,du\,dt + \frac{1}{2}\int_0^1\int_1^{1+t} 2u\,du\,dt + \frac{1}{2}\int_1^2\int_{t-1}^{3-t} 2u\,du\,dt = 11/6 \approx 1.8333$ (12.34)

$E[W^2] = \frac{1}{2}\int_0^1\int_{1-t}^{1} t^2\,du\,dt + \frac{1}{2}\int_0^1\int_1^{1+t} 4u^2\,du\,dt + \frac{1}{2}\int_1^2\int_{t-1}^{3-t} 4u^2\,du\,dt = 103/24$ (12.35)

$\mathrm{Var}[W] = 103/24 - (11/6)^2 = 67/72 \approx 0.9306$ (12.36)

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  ((u<=min(t+1,3-t))& ...
        (u>=max(1-t,t-1)))/2
Use array operations on X, Y, PX, PY, t, u, and P
M = max(t,u)<=1;
G = t.*M + 2*u.*(1 - M);         % Z = g(X,Y)
EG = total(G.*P)                 % E[g(X,Y)]
EG = 1.8340                      % Theoretical 11/6 = 1.8333
VG = total(G.^2.*P) - EG^2
VG = 0.9368                      % Theoretical 67/72 = 0.9306
[Z,PZ] = csort(G,P);             % Distribution for Z
EZ = Z*PZ'                       % E[Z] from distribution
EZ = 1.8340
VZ = (Z.^2)*PZ' - EZ^2
VZ = 0.9368
Example 12.8: A function with compound definition

$f_{XY}(t, u) = 3$ on $0 \le u \le t^2 \le 1$ (12.37)

$Z = I_Q(X, Y)\,X + I_{Q^c}(X, Y)$ for $Q = \{(t, u) : u + t \le 1\}$ (12.38)

The value $t_0$ where the line $u = 1 - t$ and the curve $u = t^2$ meet satisfies $t_0^2 = 1 - t_0$.

$E[Z] = 3\int_0^{t_0} t\int_0^{t^2} du\,dt + 3\int_{t_0}^{1} t\int_0^{1-t} du\,dt + 3\int_{t_0}^{1}\int_{1-t}^{t^2} du\,dt = \frac{3}{4}(5t_0 - 2)$ (12.39)

For $E[Z^2]$ replace t by $t^2$ in the integrands to get $E[Z^2] = (25t_0 - 1)/20$.

Using $t_0 = (\sqrt{5} - 1)/2 \approx 0.6180$, we get $\mathrm{Var}[Z] = (2125t_0 - 1309)/80 \approx 0.0540$.
APPROXIMATION

% Theoretical values
t0 = (sqrt(5) - 1)/2
t0 = 0.6180
EZ = (3/4)*(5*t0 - 2)
EZ = 0.8176
EZ2 = (25*t0 - 1)/20
EZ2 = 0.7225
VZ = (2125*t0 - 1309)/80
VZ = 0.0540
tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  3*(u <= t.^2)
Use array operations on X, Y, t, u, and P
G = (t+u <= 1).*t + (t+u > 1);
EG = total(G.*P)
EG = 0.8169                      % Theoretical = 0.8176
VG = total(G.^2.*P) - EG^2
VG = 0.0540                      % Theoretical = 0.0540
[Z,PZ] = csort(G,P);
EZ = Z*PZ'
EZ = 0.8169
VZ = (Z.^2)*PZ' - EZ^2
VZ = 0.0540

Standard deviation and the Chebyshev inequality 

In Example 5 (Example 10.5: The normal distribution and standardized normal distribution) from "Functions
of a Random Variable," we show that if $X \sim N(\mu, \sigma^2)$ then $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$. Also, $E[X] = \mu$
and $\mathrm{Var}[X] = \sigma^2$. Thus

$P\left(\frac{|X - \mu|}{\sigma} \le t\right) = P(|X - \mu| \le t\sigma) = 2\Phi(t) - 1$ (12.40)

For the normal distribution, the standard deviation $\sigma$ seems to be a natural measure of the variation away
from the mean.

For a general distribution with mean $\mu$ and variance $\sigma^2$, we have the
Chebyshev inequality

$P\left(\frac{|X - \mu|}{\sigma} \ge a\right) \le \frac{1}{a^2}$ or $P(|X - \mu| \ge a\sigma) \le \frac{1}{a^2}$ (12.41)

In this general case, the standard deviation appears as a measure of the variation from the mean value. This
inequality is useful in many theoretical applications as well as some practical ones. However, since it must
hold for any distribution which has a variance, the bound is not particularly tight. It may be instructive
to compare the bound on the probability given by the Chebyshev inequality with the actual probability for
the normal distribution.



t = 1:0.5:3;
p = 2*(1 - gaussian(0,1,t));
c = ones(1,length(t))./(t.^2);
r = c./p;
h = ['    t     Chebyshev    Prob     Ratio'];
m = [t;c;p;r]';
disp(h)
    t     Chebyshev    Prob     Ratio
disp(m)
    1.0000    1.0000    0.3173    3.1515
    1.5000    0.4444    0.1336    3.3263
    2.0000    0.2500    0.0455    5.4945
    2.5000    0.1600    0.0124   12.8831
    3.0000    0.1111    0.0027   41.1554

□ 
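
The same comparison can be made without the m-function gaussian. The sketch below relies on the identity $2(1 - \Phi(t)) = \mathrm{erfc}(t/\sqrt{2})$ for the standard normal distribution and uses only the built-in erfc; it is offered as an alternative illustration, not as the procedure used in the text.

```matlab
t = 1:0.5:3;
p = erfc(t/sqrt(2));          % P(|Z| >= t) for Z ~ N(0,1), equal to 2*(1 - Phi(t))
c = 1./t.^2;                  % Chebyshev bound
r = c./p;                     % ratio of bound to actual probability
disp([t; c; p; r]')
```

The ratio column grows rapidly with t, which shows how loose the Chebyshev bound becomes in the tails of the normal distribution.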

DERIVATION OF THE CHEBYSHEV INEQUALITY 

Let $A = \{|X - \mu| \ge a\sigma\} = \{(X - \mu)^2 \ge a^2\sigma^2\}$. Then $a^2\sigma^2 I_A \le (X - \mu)^2$.
Upon taking expectations of both sides and using monotonicity, we have

$a^2\sigma^2 P(A) \le E[(X - \mu)^2] = \sigma^2$

from which the Chebyshev inequality follows immediately.
— □ 

We consider three concepts which are useful in many situations. 
Definition. A random variable X is centered iff E [X] = 0. 

X = X — fj, is always centered. 
Definition. A random variable X is standardized iff E [X] = and Var [X] = 1. 

t X-n X' 

X = = — is standardized 

a a 

Definition. A pair {X, Y} of random variables is uncorrelated iff 



(12.42) 



(12.43) 



(12.44) 



E [XY] - E [X] E [Y] = 



(12.45) 



It is always possible to derive an uncorrelated pair as a function of a pair {X, Y}, both of which have finite
variances. Consider

\[ U = (X^* + Y^*), \quad V = (X^* - Y^*), \quad \text{where } X^* = \frac{X - \mu_X}{\sigma_X}, \ Y^* = \frac{Y - \mu_Y}{\sigma_Y} \tag{12.46} \]

Now \( E[U] = E[V] = 0 \) and

\[ E[UV] = E[(X^* + Y^*)(X^* - Y^*)] = E[(X^*)^2] - E[(Y^*)^2] = 1 - 1 = 0 \tag{12.47} \]

so the pair is uncorrelated.
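The construction can be checked numerically on any pair with finite variances. The sketch below uses a small joint distribution invented for illustration (it is not the jdemo1 data), and Python rather than the m-programs; the names pmf, expect, U and V are assumptions of this example.

```python
import math

# Hypothetical joint distribution {(t, u): P(X = t, Y = u)}; not the jdemo1 data.
pmf = {(0, 1): 0.2, (1, 0): 0.1, (1, 2): 0.3, (2, 1): 0.1, (3, 4): 0.3}

def expect(g):
    """E[g(X, Y)] for the discrete joint distribution above."""
    return sum(p * g(t, u) for (t, u), p in pmf.items())

ex, ey = expect(lambda t, u: t), expect(lambda t, u: u)
sx = math.sqrt(expect(lambda t, u: (t - ex) ** 2))
sy = math.sqrt(expect(lambda t, u: (u - ey) ** 2))

def U(t, u):
    """Value of X* + Y* at the point (t, u)."""
    return (t - ex) / sx + (u - ey) / sy

def V(t, u):
    """Value of X* - Y* at the point (t, u)."""
    return (t - ex) / sx - (u - ey) / sy

cov_xy = expect(lambda t, u: t * u) - ex * ey                      # original pair
cov_uv = expect(lambda t, u: U(t, u) * V(t, u)) - expect(U) * expect(V)

print(f"Cov[X, Y] = {cov_xy:.4f}")   # clearly nonzero for this distribution
print(f"Cov[U, V] = {cov_uv:.2e}")   # zero up to floating-point roundoff
```

The original pair is correlated, while the derived pair {U, V} is uncorrelated up to roundoff, as (12.47) predicts.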

Example 12.9: Determining an uncorrelated pair

We use the distribution for Example 10 (Example 11.10: Expectation for Z = g(X, Y))
from "Mathematical Expectation: Simple Random Variables" and Example 12.3 (Z = g(X, Y)),
for which

\[ E[XY] - E[X]E[Y] \ne 0 \tag{12.48} \]



jdemo1
jcalc
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
 Use array operations on matrices X, Y, PX, PY, t, u, and P
EX = total(t.*P)
EX = 0.6420
EY = total(u.*P)
EY = 0.0783
EXY = total(t.*u.*P)
EXY = -0.1130
c = EXY - EX*EY
c = -0.1633                  % {X,Y} not uncorrelated
VX = total(t.^2.*P) - EX^2
VX = 3.3016
VY = total(u.^2.*P) - EY^2
VY = 3.6566
SX = sqrt(VX)
SX = 1.8170
SY = sqrt(VY)
SY = 1.9122
x = (t - EX)/SX;             % Standardized random variables
y = (u - EY)/SY;
uu = x + y;                  % Uncorrelated random variables
vv = x - y;
EUV = total(uu.*vv.*P)       % Check for uncorrelated condition
EUV = 9.9755e-06             % Differs from zero because of roundoff




12.2 Covariance and the Correlation Coefficient 2 
12.2.1 Covariance and the Correlation Coefficient 

The mean value \( \mu_X = E[X] \) and the variance \( \sigma_X^2 = E[(X - \mu_X)^2] \) give important information about the
distribution for real random variable X. Can the expectation of an appropriate function of (X, Y) give useful
information about the joint distribution? A clue to one possibility is given in the expression

\[ \mathrm{Var}[X \pm Y] = \mathrm{Var}[X] + \mathrm{Var}[Y] \pm 2\left(E[XY] - E[X]E[Y]\right) \tag{12.49} \]

The expression \( E[XY] - E[X]E[Y] \) vanishes if the pair is independent (and in some other cases). We note
also that for \( \mu_X = E[X] \) and \( \mu_Y = E[Y] \)

\[ E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - \mu_X\mu_Y \tag{12.50} \]

To see this, expand the expression \( (X - \mu_X)(Y - \mu_Y) \) and use linearity to get

\[ E[(X - \mu_X)(Y - \mu_Y)] = E[XY - \mu_Y X - \mu_X Y + \mu_X\mu_Y] = E[XY] - \mu_Y E[X] - \mu_X E[Y] + \mu_X\mu_Y \tag{12.51} \]

which reduces directly to the desired expression. Now for given \( \omega \), \( X(\omega) - \mu_X \) is the variation of X from
its mean and \( Y(\omega) - \mu_Y \) is the variation of Y from its mean. For this reason, the following terminology is
used.

Definition. The quantity \( \mathrm{Cov}[X, Y] = E[(X - \mu_X)(Y - \mu_Y)] \) is called the covariance of X and Y.

If we let \( X' = X - \mu_X \) and \( Y' = Y - \mu_Y \) be the centered random variables, then

\[ \mathrm{Cov}[X, Y] = E[X'Y'] \tag{12.52} \]

Note that the variance of X is the covariance of X with itself.

If we standardize, with \( X^* = (X - \mu_X)/\sigma_X \) and \( Y^* = (Y - \mu_Y)/\sigma_Y \), we have
Definition. The correlation coefficient \( \rho = \rho[X, Y] \) is the quantity

\[ \rho[X, Y] = E[X^*Y^*] = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X\sigma_Y} \tag{12.53} \]

Thus \( \rho = \mathrm{Cov}[X, Y]/\sigma_X\sigma_Y \). We examine these concepts for information on the joint distribution. By
Schwarz' inequality (E15), we have

\[ \rho^2 = E^2[X^*Y^*] \le E[(X^*)^2]\,E[(Y^*)^2] = 1 \quad \text{with equality iff } Y^* = cX^* \tag{12.54} \]

Now equality holds iff

\[ 1 = c^2 E^2[(X^*)^2] = c^2 \quad \text{which implies } c = \pm 1 \text{ and } \rho = \pm 1 \tag{12.55} \]

We conclude \( -1 \le \rho \le 1 \), with \( \rho = \pm 1 \) iff \( Y^* = \pm X^* \).

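Relationship between \( \rho \) and the joint distribution

• We consider first the distribution for the standardized pair \( (X^*, Y^*) \)

• Since

\[ P(X^* \le r, Y^* \le s) = P\left(\frac{X - \mu_X}{\sigma_X} \le r, \frac{Y - \mu_Y}{\sigma_Y} \le s\right) = P(X \le t = \sigma_X r + \mu_X, \ Y \le u = \sigma_Y s + \mu_Y) \tag{12.56} \]
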

2 This content is available online at <http://cnx.Org/content/m23460/l.5/>. 
we obtain the results for the distribution for (X, Y) by the mapping

\[ t = \sigma_X r + \mu_X, \qquad u = \sigma_Y s + \mu_Y \tag{12.57} \]

Joint distribution for the standardized variables \( (X^*, Y^*) \), \( (r, s) = (X^*, Y^*)(\omega) \)

\( \rho = 1 \) iff \( X^* = Y^* \) iff all probability mass is on the line s = r.
\( \rho = -1 \) iff \( X^* = -Y^* \) iff all probability mass is on the line s = -r.

If \( -1 < \rho < 1 \), then at least some of the mass must fail to be on these lines.



Figure 12.1: Distance from point (r, s) to the line s = r.



The \( \rho = \pm 1 \) lines for the (X, Y) distribution are:

\[ \frac{u - \mu_Y}{\sigma_Y} = \pm\frac{t - \mu_X}{\sigma_X} \quad \text{or} \quad u = \pm\frac{\sigma_Y}{\sigma_X}(t - \mu_X) + \mu_Y \tag{12.58} \]


Consider \( Z = Y^* - X^* \). Then \( E[\tfrac12 Z^2] = \tfrac12 E[(Y^* - X^*)^2] \). Reference to Figure 12.1 shows this is the
average of the square of the distances of the points \( (r, s) = (X^*, Y^*)(\omega) \) from the line s = r (i.e., the variance
about the line s = r). Similarly for \( W = Y^* + X^* \), \( E[W^2/2] \) is the variance about s = -r. Now

\[ \tfrac12 E[(Y^* \pm X^*)^2] = \tfrac12\left\{E[(Y^*)^2] + E[(X^*)^2] \pm 2E[X^*Y^*]\right\} = 1 \pm \rho \tag{12.59} \]

Thus

\( 1 - \rho \) is the variance about s = r (the \( \rho = 1 \) line)
\( 1 + \rho \) is the variance about s = -r (the \( \rho = -1 \) line)

Now since

\[ E[(Y^* - X^*)^2] = E[(Y^* + X^*)^2] \quad \text{iff} \quad \rho = E[X^*Y^*] = 0 \tag{12.60} \]

the condition \( \rho = 0 \) is the condition for equality of the two variances.
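Transformation to the (X, Y) plane

\[ t = \sigma_X r + \mu_X, \quad u = \sigma_Y s + \mu_Y, \qquad r = \frac{t - \mu_X}{\sigma_X}, \quad s = \frac{u - \mu_Y}{\sigma_Y} \tag{12.61} \]

The \( \rho = 1 \) line is:

\[ \frac{u - \mu_Y}{\sigma_Y} = \frac{t - \mu_X}{\sigma_X} \quad \text{or} \quad u = \frac{\sigma_Y}{\sigma_X}(t - \mu_X) + \mu_Y \tag{12.62} \]

The \( \rho = -1 \) line is:

\[ \frac{u - \mu_Y}{\sigma_Y} = -\frac{t - \mu_X}{\sigma_X} \quad \text{or} \quad u = -\frac{\sigma_Y}{\sigma_X}(t - \mu_X) + \mu_Y \tag{12.63} \]

\( 1 - \rho \) is proportional to the variance about the \( \rho = 1 \) line and \( 1 + \rho \) is proportional to the variance about the
\( \rho = -1 \) line. \( \rho = 0 \) iff the variances about both are the same.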
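The identities \( \tfrac12 E[(Y^* - X^*)^2] = 1 - \rho \) and \( \tfrac12 E[(Y^* + X^*)^2] = 1 + \rho \) are easy to confirm numerically. The following sketch does so in Python for a small joint distribution made up for this purpose; the data and the helper names are assumptions of the example, not part of the text's m-programs.

```python
import math

# Hypothetical joint distribution {(t, u): P(X = t, Y = u)}, chosen only for illustration.
pmf = {(-1, -2): 0.3, (0, 1): 0.2, (1, 0): 0.1, (2, 3): 0.4}

def expect(g):
    """E[g(X, Y)] for the discrete joint distribution above."""
    return sum(p * g(t, u) for (t, u), p in pmf.items())

ex, ey = expect(lambda t, u: t), expect(lambda t, u: u)
sx = math.sqrt(expect(lambda t, u: (t - ex) ** 2))
sy = math.sqrt(expect(lambda t, u: (u - ey) ** 2))

def r(t):
    """Standardized value of X."""
    return (t - ex) / sx

def s(u):
    """Standardized value of Y."""
    return (u - ey) / sy

rho = expect(lambda t, u: r(t) * s(u))
var_about_plus = 0.5 * expect(lambda t, u: (s(u) - r(t)) ** 2)    # about the line s = r
var_about_minus = 0.5 * expect(lambda t, u: (s(u) + r(t)) ** 2)   # about the line s = -r

print(f"rho = {rho:.4f}")
print(f"variance about s =  r: {var_about_plus:.4f}  (1 - rho = {1 - rho:.4f})")
print(f"variance about s = -r: {var_about_minus:.4f}  (1 + rho = {1 + rho:.4f})")
```

Because this distribution has most of its mass near the line s = r, the first variance is small and the second is close to 2.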

Example 12.10: Uncorrelated but not independent 

Suppose the joint density for {X, Y} is constant on the unit circle about the origin. By the rectangle
test, the pair cannot be independent. By symmetry, the \( \rho = 1 \) line is u = t and the \( \rho = -1 \) line is
u = -t. By symmetry, also, the variance about each of these lines is the same. Thus \( \rho = 0 \), which
is true iff \( \mathrm{Cov}[X, Y] = 0 \). This fact can be verified by calculation, if desired.

Example 12.11: Uniform marginal distributions 



Figure 12.2: Uniform marginals but different correlation coefficients. Panels: (a) rho = 0, (b) rho = 3/4, (c) rho = -3/4.
Consider the three distributions in Figure 12.2. In case (a), the distribution is uniform over
the square centered at the origin with vertices at (1,1), (-1,1), (-1,-1), (1,-1). In case (b), the
distribution is uniform over two squares, in the first and third quadrants with vertices (0,0), (1,0),
(1,1), (0,1) and (0,0), (-1,0), (-1,-1), (0,-1). In case (c) the two squares are in the second and
fourth quadrants. The marginals are uniform on (-1,1) in each case, so that in each case

\[ E[X] = E[Y] = 0 \quad \text{and} \quad \mathrm{Var}[X] = \mathrm{Var}[Y] = 1/3 \tag{12.64} \]

This means the \( \rho = 1 \) line is u = t and the \( \rho = -1 \) line is u = -t.

a. By symmetry, \( E[XY] = 0 \) (in fact the pair is independent) and \( \rho = 0 \).

b. For every pair of possible values, the two signs must be the same, so \( E[XY] > 0 \) which implies
\( \rho > 0 \). The actual value may be calculated to give \( \rho = 3/4 \). Since \( 1 - \rho < 1 + \rho \), the variance
about the \( \rho = 1 \) line is less than that about the \( \rho = -1 \) line. This is evident from the figure.

c. \( E[XY] < 0 \) and \( \rho < 0 \). Since \( 1 + \rho < 1 - \rho \), the variance about the \( \rho = -1 \) line is less than
that about the \( \rho = 1 \) line. Again, examination of the figure confirms this.
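The value quoted in case (b) can be obtained directly. The following is a short worked calculation, assuming only the description of case (b) given above: the total area of the two unit squares is 2, so the density is 1/2 on each, and by symmetry both squares contribute equally to E[XY].

```latex
\begin{align*}
E[XY] &= 2 \cdot \frac{1}{2} \int_0^1 \!\! \int_0^1 t\,u \; du \, dt
       = \left(\int_0^1 t \, dt\right)\left(\int_0^1 u \, du\right) = \frac{1}{4}, \\
\rho &= \frac{E[XY] - E[X]E[Y]}{\sigma_X \sigma_Y}
      = \frac{1/4 - 0}{\sqrt{1/3}\,\sqrt{1/3}} = \frac{3}{4}.
\end{align*}
```

In case (c) the product tu is negative on both squares, so the same calculation gives E[XY] = -1/4 and a correlation coefficient of -3/4.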

Example 12.12: A pair of simple random variables 

With the aid of m-functions and MATLAB we can easily calculate the covariance and the correlation
coefficient. We use the joint distribution for Example 9 (Example 12.9: Determining an uncorrelated
pair) in "Variance." In that example calculations show

\[ E[XY] - E[X]E[Y] = -0.1633 = \mathrm{Cov}[X, Y], \quad \sigma_X = 1.8170 \ \text{and} \ \sigma_Y = 1.9122 \tag{12.65} \]

so that \( \rho = -0.04699 \).

Example 12.13: An absolutely continuous pair 

The pair {X, Y} has joint density function \( f_{XY}(t, u) = \frac{6}{5}(t + 2u) \) on the triangular region bounded
by t = 0, u = t, and u = 1. By the usual integration techniques, we have

\[ f_X(t) = \frac{6}{5}\left(1 + t - 2t^2\right), \ 0 \le t \le 1 \quad \text{and} \quad f_Y(u) = 3u^2, \ 0 \le u \le 1 \tag{12.66} \]

From this we obtain \( E[X] = 2/5 \), \( \mathrm{Var}[X] = 3/50 \), \( E[Y] = 3/4 \), and \( \mathrm{Var}[Y] = 3/80 \). To
complete the picture we need

\[ E[XY] = \frac{6}{5} \int_0^1 \!\! \int_t^1 \left(t^2 u + 2tu^2\right) du \, dt = 8/25 \tag{12.67} \]

Then

\[ \mathrm{Cov}[X, Y] = E[XY] - E[X]E[Y] = 2/100 \quad \text{and} \quad \rho = \frac{\mathrm{Cov}[X, Y]}{\sigma_X\sigma_Y} = \frac{4}{30}\sqrt{10} \approx 0.4216 \tag{12.68} \]


APPROXIMATION

tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  (6/5)*(t + 2*u).*(u>=t)
Use array operations on X, Y, PX, PY, t, u, and P
EX = total(t.*P)
EX = 0.4012                  % Theoretical = 0.4
EY = total(u.*P)
EY = 0.7496                  % Theoretical = 0.75
VX = total(t.^2.*P) - EX^2
VX = 0.0603                  % Theoretical = 0.06
VY = total(u.^2.*P) - EY^2
VY = 0.0376                  % Theoretical = 0.0375
CV = total(t.*u.*P) - EX*EY
CV = 0.0201                  % Theoretical = 0.02
rho = CV/sqrt(VX*VY)
rho = 0.4212                 % Theoretical = 0.4216
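For readers without the m-programs, the same kind of discrete approximation can be sketched in a few lines of Python. This is only an illustration of the idea: the function grid_moments is hypothetical, it places mass at cell midpoints and normalises the total, and the author's tuappr may differ in its details.

```python
import math

def grid_moments(density, x_range, y_range, n):
    """Approximate moments of (X, Y) from density values on an n-by-n midpoint grid."""
    (a, b), (c, d) = x_range, y_range
    dt, du = (b - a) / n, (d - c) / n
    ts = [a + (i + 0.5) * dt for i in range(n)]
    us = [c + (j + 0.5) * du for j in range(n)]
    mass = s_t = s_u = s_tt = s_uu = s_tu = 0.0
    for t in ts:
        for u in us:
            w = density(t, u)          # unnormalised probability mass at the grid point
            mass += w
            s_t += t * w
            s_u += u * w
            s_tt += t * t * w
            s_uu += u * u * w
            s_tu += t * u * w
    ex, ey = s_t / mass, s_u / mass    # dividing by the total mass normalises the weights
    vx = s_tt / mass - ex ** 2
    vy = s_uu / mass - ey ** 2
    cv = s_tu / mass - ex * ey
    return {"EX": ex, "EY": ey, "VX": vx, "VY": vy, "CV": cv,
            "rho": cv / math.sqrt(vx * vy)}

def density(t, u):
    """Joint density of Example 12.13: (6/5)(t + 2u) on the triangle u >= t."""
    return 1.2 * (t + 2 * u) if u >= t else 0.0

m = grid_moments(density, (0, 1), (0, 1), 200)
for name, value in m.items():
    print(f"{name} = {value:.4f}")
```

With 200 points per axis the results should agree with the theoretical values 0.4, 0.75, 0.06, 0.0375, 0.02 and 0.4216 to about two or three decimal places; the grid cells cut by the boundary u = t are the main source of error.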



Coefficient of linear correlation 

The parameter \( \rho \) is usually called the correlation coefficient. A more descriptive name would be coefficient
of linear correlation. The following example shows that all probability mass may be on a curve, so that
Y = g(X) (i.e., the value of Y is completely determined by the value of X), yet \( \rho = 0 \).

Example 12.14: Y = g(X) but \( \rho = 0 \)

Suppose X ~ uniform (-1,1), so that \( f_X(t) = 1/2, \ -1 < t < 1 \) and \( E[X] = 0 \). Let Y = g(X) =
cos X. Then

\[ \mathrm{Cov}[X, Y] = E[XY] = \frac{1}{2}\int_{-1}^{1} t \cos t \, dt = 0 \tag{12.69} \]

Thus \( \rho = 0 \). Note that g could be any even function defined on (-1,1). In this case the integrand
t g(t) is odd, so that the value of the integral is zero.
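A quick numerical check of this remark, sketched in Python under the assumption that a midpoint rule on (-1, 1) is accurate enough for the purpose; the function name and the choice of test functions are illustrative only.

```python
import math

def cov_x_gx(g, n=2000):
    """Midpoint-rule approximation of Cov[X, g(X)] for X ~ uniform(-1, 1)."""
    dt = 2.0 / n
    ts = [-1.0 + (k + 0.5) * dt for k in range(n)]
    w = 0.5 * dt                                  # density 1/2 times cell width
    e_x = sum(t * w for t in ts)
    e_y = sum(g(t) * w for t in ts)
    e_xy = sum(t * g(t) * w for t in ts)
    return e_xy - e_x * e_y

cov_cos = cov_x_gx(math.cos)                      # even g: covariance vanishes
cov_square = cov_x_gx(lambda t: t * t)            # even g: covariance vanishes
cov_exp = cov_x_gx(math.exp)                      # g not even: covariance is 1/e

print(f"Cov[X, cos X] = {cov_cos:.2e}")
print(f"Cov[X, X^2]   = {cov_square:.2e}")
print(f"Cov[X, e^X]   = {cov_exp:.4f}")
```

The last case is included for contrast: exp is not even, and the exact covariance there is 1/e, so zero covariance is a consequence of the symmetry rather than of the functional dependence.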

Variance and covariance for linear combinations 

We generalize the property (V4) on linear combinations. Consider the linear combinations

\[ X = \sum_{i=1}^{n} a_i X_i \quad \text{and} \quad Y = \sum_{j=1}^{m} b_j Y_j \tag{12.70} \]

We wish to determine Cov[X, Y] and Var[X]. It is convenient to work with the centered random variables
\( X' = X - \mu_X \) and \( Y' = Y - \mu_Y \). Since by linearity of expectation,

\[ \mu_X = \sum_{i=1}^{n} a_i \mu_{X_i} \quad \text{and} \quad \mu_Y = \sum_{j=1}^{m} b_j \mu_{Y_j} \tag{12.71} \]

we have

\[ X' = \sum_{i=1}^{n} a_i X_i - \sum_{i=1}^{n} a_i \mu_{X_i} = \sum_{i=1}^{n} a_i \left(X_i - \mu_{X_i}\right) = \sum_{i=1}^{n} a_i X_i' \tag{12.72} \]

and similarly for \( Y' \). By definition

\[ \mathrm{Cov}(X, Y) = E[X'Y'] = E\Big[\sum_{i,j} a_i b_j X_i' Y_j'\Big] = \sum_{i,j} a_i b_j E[X_i' Y_j'] = \sum_{i,j} a_i b_j \mathrm{Cov}(X_i, Y_j) \tag{12.73} \]

In particular

\[ \mathrm{Var}(X) = \mathrm{Cov}(X, X) = \sum_{i,j} a_i a_j \mathrm{Cov}(X_i, X_j) = \sum_{i=1}^{n} a_i^2 \mathrm{Cov}(X_i, X_i) + \sum_{i \ne j} a_i a_j \mathrm{Cov}(X_i, X_j) \tag{12.74} \]

Using the fact that \( a_i a_j \mathrm{Cov}(X_i, X_j) = a_j a_i \mathrm{Cov}(X_j, X_i) \), we have

\[ \mathrm{Var}[X] = \sum_{i=1}^{n} a_i^2 \mathrm{Var}[X_i] + 2\sum_{i<j} a_i a_j \mathrm{Cov}(X_i, X_j) \tag{12.75} \]

Note that \( a_i^2 \) does not depend upon the sign of \( a_i \). If the \( X_i \) form an independent class, or are otherwise
uncorrelated, the expression for variance reduces to

\[ \mathrm{Var}[X] = \sum_{i=1}^{n} a_i^2 \mathrm{Var}[X_i] \tag{12.76} \]
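Formulas (12.73) and (12.75) translate directly into code. The sketch below is one possible Python rendering; the function names and the covariance values are assumptions made for illustration, with the covariances supplied as a nested list indexed by i and j.

```python
def cov_linear(a, b, cross_cov):
    """Cov[sum_i a_i X_i, sum_j b_j Y_j] = sum_{i,j} a_i b_j Cov[X_i, Y_j]  (12.73)."""
    return sum(a[i] * b[j] * cross_cov[i][j]
               for i in range(len(a)) for j in range(len(b)))

def var_linear(a, cov):
    """Var[sum_i a_i X_i] as variance terms plus twice the i < j covariance terms (12.75)."""
    n = len(a)
    diagonal = sum(a[i] ** 2 * cov[i][i] for i in range(n))
    off_diagonal = sum(a[i] * a[j] * cov[i][j]
                       for i in range(n) for j in range(i + 1, n))
    return diagonal + 2 * off_diagonal

# Illustrative values: Var[X1] = 3, Var[X2] = 8, Cov[X1, X2] = 1, and X = 2*X1 - X2.
cov = [[3, 1],
       [1, 8]]
a = [2, -1]

var_x = var_linear(a, cov)           # 4*3 + 1*8 + 2*(2)(-1)(1)
var_x_check = cov_linear(a, a, cov)  # Var[X] = Cov[X, X]
print(var_x, var_x_check)
```

Both routes give the same number, which is the content of (12.74): the variance is the covariance of the linear combination with itself.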

12.3 Linear Regression 3 
12.3.1 Linear Regression

Suppose that a pair {X, Y} of random variables has a joint distribution. A value \( X(\omega) \) is observed. It is
desired to estimate the corresponding value \( Y(\omega) \). Obviously there is no rule for determining \( Y(\omega) \) unless
Y is a function of X. The best that can be hoped for is some estimate based on an average of the errors, or
on the average of some function of the errors.

Suppose \( X(\omega) \) is observed, and by some rule an estimate \( \hat{Y}(\omega) \) is returned. The error of the estimate is
\( Y(\omega) - \hat{Y}(\omega) \). The most common measure of error is the mean of the square of the error

\[ E\left[(Y - \hat{Y})^2\right] \tag{12.77} \]

The choice of the mean square has two important properties: it treats positive and negative errors alike,
and it weights large errors more heavily than smaller ones. In general, we seek a rule (function) r such that
the estimate \( \hat{Y}(\omega) \) is \( r(X(\omega)) \). That is, we seek a function r such that

\[ E\left[(Y - r(X))^2\right] \ \text{is a minimum.} \tag{12.78} \]

The problem of determining such a function is known as the regression problem. In the unit on Regression
(Section 14.1.5: The regression problem), we show that this problem is solved by the conditional expectation
of Y, given X. At this point, we seek an important partial solution.

The regression line of Y on X

We seek the best straight line function for minimizing the mean squared error. That is, we seek a function
r of the form u = r(t) = at + b. The problem is to determine the coefficients a, b such that

\[ E\left[(Y - aX - b)^2\right] \ \text{is a minimum} \tag{12.79} \]

We write the error in a special form, then square and take the expectation. With \( \beta = b + a\mu_X - \mu_Y \),

\[ \text{Error} = Y - aX - b = (Y - \mu_Y) - a(X - \mu_X) + \mu_Y - a\mu_X - b = (Y - \mu_Y) - a(X - \mu_X) - \beta \tag{12.80} \]

\[ \text{Error squared} = (Y - \mu_Y)^2 + a^2(X - \mu_X)^2 + \beta^2 - 2\beta(Y - \mu_Y) + 2a\beta(X - \mu_X) - 2a(Y - \mu_Y)(X - \mu_X) \tag{12.81} \]

\[ E\left[(Y - aX - b)^2\right] = \sigma_Y^2 + a^2\sigma_X^2 + \beta^2 - 2a\,\mathrm{Cov}[X, Y] \tag{12.82} \]

The term \( \beta^2 \) is nonnegative and vanishes when \( b = \mu_Y - a\mu_X \). Standard procedures for determining a
minimum (with respect to a) then show that the minimum occurs for

\[ a = \frac{\mathrm{Cov}[X, Y]}{\mathrm{Var}[X]}, \qquad b = \mu_Y - a\mu_X \tag{12.83} \]

Thus the optimum line, called the regression line of Y on X, is

\[ u = \frac{\mathrm{Cov}[X, Y]}{\mathrm{Var}[X]}(t - \mu_X) + \mu_Y = \rho\frac{\sigma_Y}{\sigma_X}(t - \mu_X) + \mu_Y = \alpha(t) \tag{12.84} \]

The second form is commonly used to define the regression line. For certain theoretical purposes, this is
the preferred form. But for calculation, the first form is usually the more convenient. Only the covariance
(which requires both means) and the variance of X are needed. There is no need to determine Var[Y] or \( \rho \).
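As a small illustration of the first form, the coefficients can be computed from the two means, Var[X], and Cov[X, Y] alone. The Python sketch below uses a hypothetical helper regression_line and, as input, the rounded values reported in Example 12.9 (EX = 0.6420, EY = 0.0783, VX = 3.3016, Cov = -0.1633).

```python
def regression_line(ex, ey, var_x, cov_xy):
    """Slope and intercept of the regression line of Y on X: u = a*t + b."""
    a = cov_xy / var_x        # a = Cov[X, Y] / Var[X]
    b = ey - a * ex           # b = mu_Y - a * mu_X
    return a, b

# Rounded moments taken from Example 12.9.
a, b = regression_line(ex=0.6420, ey=0.0783, var_x=3.3016, cov_xy=-0.1633)
print(f"u = {a:.4f} t + {b:.4f}")
```

Neither Var[Y] nor the correlation coefficient enters the computation, which is the practical advantage of the first form.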

Example 12.15: The simple pair of Example 3 (Example 12.3: Z = g(X, Y)) from "Variance"

jdemo1
jcalc
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
 Use array operations on matrices X, Y, PX, PY, t, u, and P
EX = total(t.*P)
EX = 0.6420
EY = total(u.*P)
EY = 0.0783
VX = total(t.^2.*P) - EX^2
VX = 3.3016
CV = total(t.*u.*P) - EX*EY
CV = -0.1633
a = CV/VX
a = -0.0495
b = EY - a*EX
b = 0.1100                   % The regression line is u = -0.0495t + 0.11

Example 12.16: The pair in Example 6 (Example 12.6: A jointly distributed pair) from "Variance"

Suppose the pair {X, Y} has joint density \( f_{XY}(t, u) = 3u \) on the triangular region bounded by
u = 0, u = 1 + t, u = 1 - t. Determine the regression line of Y on X.
ANALYTIC SOLUTION

By symmetry, \( E[X] = E[XY] = 0 \), so \( \mathrm{Cov}[X, Y] = 0 \). The regression curve is

\[ u = E[Y] = 3\int_0^1 u^2 \int_{u-1}^{1-u} dt \, du = 6\int_0^1 u^2(1 - u) \, du = 1/2 \tag{12.85} \]

Note that the pair is uncorrelated, but by the rectangle test is not independent. With zero values of
E[X] and E[XY], the approximation procedure is not very satisfactory unless a very large number
of approximation points are employed.
Example 12.17: Distribution of Example 5 (Example 8.11: Marginal distribution with
compound expression) from "Random Vectors and MATLAB" and Example 12 (Example 10.26)
from "Function of Random Vectors"

The pair {X, Y} has joint density \( f_{XY}(t, u) = \frac{6}{37}(t + 2u) \) on the region \( 0 \le t \le 2, \ 0 \le u \le
\max\{1, t\} \) (see Figure 12.3). Determine the regression line of Y on X. If the value \( X(\omega) = 1.7 \)
is observed, what is the best mean-square linear estimate of \( Y(\omega) \)?

Figure 12.3: Regression line for Example 12.17.



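ANALYTIC SOLUTION

\[ E[X] = \frac{6}{37}\int_0^1 \!\! \int_0^1 \left(t^2 + 2tu\right) du \, dt + \frac{6}{37}\int_1^2 \!\! \int_0^t \left(t^2 + 2tu\right) du \, dt = 50/37 \tag{12.86} \]

The other quantities involve integrals over the same regions with appropriate integrands, as follows: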



Quantity    Integrand        Value
E[X^2]      t^3 + 2t^2 u     779/370
E[Y]        tu + 2u^2        127/148
E[XY]       t^2 u + 2tu^2    232/185

Table 12.1
Then

\[ \mathrm{Var}[X] = \frac{779}{370} - \left(\frac{50}{37}\right)^2 = \frac{3823}{13690}, \qquad \mathrm{Cov}[X, Y] = \frac{232}{185} - \frac{50}{37}\cdot\frac{127}{148} = \frac{1293}{13690} \tag{12.87} \]

and

\[ a = \mathrm{Cov}[X, Y]/\mathrm{Var}[X] = \frac{1293}{3823} \approx 0.3382, \qquad b = E[Y] - aE[X] = \frac{6133}{15292} \approx 0.4011 \tag{12.88} \]

The regression line is u = at + b. If \( X(\omega) = 1.7 \), the best linear estimate (in the mean square
sense) is \( \hat{Y}(\omega) = 1.7a + b = 0.9760 \) (see Figure 12.3 for an approximate plot).
APPROXIMATION

tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  400
Enter number of Y approximation points  400
Enter expression for joint density  (6/37)*(t+2*u).*(u<=max(t,1))
Use array operations on X, Y, PX, PY, t, u, and P
EX = total(t.*P)
EX = 1.3517                  % Theoretical = 1.3514
EY = total(u.*P)
EY = 0.8594                  % Theoretical = 0.8581
VX = total(t.^2.*P) - EX^2
VX = 0.2790                  % Theoretical = 0.2793
CV = total(t.*u.*P) - EX*EY
CV = 0.0947                  % Theoretical = 0.0944
a = CV/VX
a = 0.3394                   % Theoretical = 0.3382
b = EY - a*EX
b = 0.4006                   % Theoretical = 0.4011
y = 1.7*a + b
y = 0.9776                   % Theoretical = 0.9760
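The exact fractions in the analytic solution can be confirmed with rational arithmetic. The sketch below uses Python's fractions module, starting from the four integral values quoted in (12.86) and Table 12.1; the variable names are illustrative.

```python
from fractions import Fraction as F

# Moments quoted in the analytic solution of Example 12.17.
ex, ex2 = F(50, 37), F(779, 370)
ey, exy = F(127, 148), F(232, 185)

var_x = ex2 - ex ** 2          # Var[X] = E[X^2] - E[X]^2
cov_xy = exy - ex * ey         # Cov[X, Y] = E[XY] - E[X]E[Y]
a = cov_xy / var_x             # slope of the regression line
b = ey - a * ex                # intercept
estimate = a * F(17, 10) + b   # best linear estimate at X = 1.7

print("Var[X]    =", var_x)
print("Cov[X, Y] =", cov_xy)
print("a =", a, "~", round(float(a), 4))
print("b =", b, "~", round(float(b), 4))
print("estimate at 1.7 ~", round(float(estimate), 4))
```

Working with exact fractions avoids the small discrepancies that appear in the discrete approximation, where the grid cells along u = max(t, 1) are only partly inside the region.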



An interpretation of \( \rho^2 \)

The analysis above shows the minimum mean squared error is given by

\[ E\left[(Y - \hat{Y})^2\right] = E\left[\left(Y - \rho\frac{\sigma_Y}{\sigma_X}(X - \mu_X) - \mu_Y\right)^2\right] = \sigma_Y^2 E\left[(Y^* - \rho X^*)^2\right] \tag{12.89} \]

\[ = \sigma_Y^2 E\left[(Y^*)^2 - 2\rho X^*Y^* + \rho^2 (X^*)^2\right] = \sigma_Y^2\left(1 - 2\rho^2 + \rho^2\right) = \sigma_Y^2\left(1 - \rho^2\right) \tag{12.90} \]

If \( \rho = 0 \), then \( E[(Y - \hat{Y})^2] = \sigma_Y^2 \), the mean squared error in the case of zero linear correlation. Then,
\( \rho^2 \) is interpreted as the fraction of uncertainty removed by the linear rule and X. This interpretation should
not be pushed too far, but is a common interpretation, often found in the discussion of observations or
experimental results.

More general linear regression 
Consider a jointly distributed class \( \{Y, X_1, X_2, \cdots, X_n\} \). We wish to determine a function U of the
form

\[ U = \sum_{i=0}^{n} a_i X_i, \ \text{with } X_0 = 1, \ \text{such that } E\left[(Y - U)^2\right] \text{ is a minimum} \tag{12.91} \]

If U satisfies this minimum condition, then \( E[(Y - U)V] = 0 \), or, equivalently

\[ E[YV] = E[UV] \quad \text{for all } V \text{ of the form } V = \sum_{i=0}^{n} c_i X_i \tag{12.92} \]

To see this, set W = Y - U and let \( d^2 = E[W^2] \). Now, for any \( \alpha \)

\[ d^2 \le E\left[(W + \alpha V)^2\right] = d^2 + 2\alpha E[WV] + \alpha^2 E[V^2] \tag{12.93} \]

If we select the special

\[ \alpha = -\frac{E[WV]}{E[V^2]} \quad \text{then} \quad 0 \le -\frac{2E[WV]^2}{E[V^2]} + \frac{E[WV]^2}{E[V^2]^2}E[V^2] \tag{12.94} \]

This implies \( E[WV]^2 \le 0 \), which can only be satisfied by \( E[WV] = 0 \), so that

\[ E[YV] = E[UV] \tag{12.95} \]

On the other hand, if \( E[(Y - U)V] = 0 \) for all V of the form above, then \( E[(Y - U)^2] \) is a minimum.
Consider

\[ E\left[(Y - V)^2\right] = E\left[(Y - U + U - V)^2\right] = E\left[(Y - U)^2\right] + E\left[(U - V)^2\right] + 2E[(Y - U)(U - V)] \tag{12.96} \]

Since U - V is of the same form as V, the last term is zero. The first term is fixed. The second term is
nonnegative, with zero value iff U - V = 0 a.s. Hence, \( E[(Y - V)^2] \) is a minimum when V = U.

If we take V to be \( 1, X_1, X_2, \cdots, X_n \), successively, we obtain n + 1 linear equations in the n + 1 unknowns
\( a_0, a_1, \cdots, a_n \), as follows.

1. \( E[Y] = a_0 + a_1 E[X_1] + \cdots + a_n E[X_n] \)

2. \( E[YX_i] = a_0 E[X_i] + a_1 E[X_1 X_i] + \cdots + a_n E[X_n X_i] \) for \( 1 \le i \le n \)

For each \( i = 1, 2, \cdots, n \), we take (2) - \( E[X_i] \cdot \) (1) and use the calculating expressions for variance and
covariance to get

\[ \mathrm{Cov}[Y, X_i] = a_1 \mathrm{Cov}[X_1, X_i] + a_2 \mathrm{Cov}[X_2, X_i] + \cdots + a_n \mathrm{Cov}[X_n, X_i] \tag{12.97} \]

These n equations plus equation (1) may be solved algebraically for the \( a_i \).

In the important special case that the \( X_i \) are uncorrelated (i.e., \( \mathrm{Cov}[X_i, X_j] = 0 \) for \( i \ne j \)), we have

\[ a_i = \frac{\mathrm{Cov}[Y, X_i]}{\mathrm{Var}[X_i]}, \quad 1 \le i \le n \tag{12.98} \]

and

\[ a_0 = E[Y] - a_1 E[X_1] - a_2 E[X_2] - \cdots - a_n E[X_n] \tag{12.99} \]

In particular, this condition holds if the class \( \{X_i : 1 \le i \le n\} \) is iid as in the case of a simple random
sample (see the section on "Simple Random Samples and Statistics" (Section 13.3)).

Examination shows that for n = 1, with \( X_1 = X \), \( a_0 = b \), and \( a_1 = a \), the result agrees with that obtained
in the treatment of the regression line, above.

Example 12.18: Linear regression with two variables. 

Suppose \( E[Y] = 3 \), \( E[X_1] = 2 \), \( E[X_2] = 3 \), \( \mathrm{Var}[X_1] = 3 \), \( \mathrm{Var}[X_2] = 8 \), \( \mathrm{Cov}[Y, X_1] = 5 \),
\( \mathrm{Cov}[Y, X_2] = 7 \), and \( \mathrm{Cov}[X_1, X_2] = 1 \). Then the three equations are

\[ \begin{array}{rcl} a_0 + 2a_1 + 3a_2 &=& 3 \\ 0 + 3a_1 + 1a_2 &=& 5 \\ 0 + 1a_1 + 8a_2 &=& 7 \end{array} \tag{12.100} \]

Solution of these simultaneous linear equations with MATLAB gives the results
\( a_0 = -1.9565 \), \( a_1 = 1.4348 \), and \( a_2 = 0.6957 \).
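The same system can be solved exactly without MATLAB. The following Python sketch uses Gauss-Jordan elimination over rational numbers; the solver is a generic illustration written for this example (it assumes a square, nonsingular coefficient matrix), not the method used in the text.

```python
from fractions import Fraction

def solve(A, y):
    """Solve A x = y by Gauss-Jordan elimination with exact fractions.

    Assumes A is square and nonsingular.
    """
    n = len(A)
    # Augmented matrix [A | y] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, y)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so the pivot entry is 1.
        M[col] = [v / M[col][col] for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]

# Coefficients and right-hand sides of the three equations in (12.100).
A = [[1, 2, 3],
     [0, 3, 1],
     [0, 1, 8]]
y = [3, 5, 7]

a0, a1, a2 = solve(A, y)
print(a0, a1, a2)
print(round(float(a0), 4), round(float(a1), 4), round(float(a2), 4))
```

The exact solution has denominator 23 in each coefficient, and the decimal values agree with those reported above.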



12.4 Problems on Variance, Covariance, Linear Regression 4 

Exercise 12.1 (Solution on p. 374.) 

(See Exercise 1 (Exercise 7.1) from "Problems on Distribution and Density Functions", and
Exercise 1 (Exercise 11.1) from "Problems on Mathematical Expectation", m-file npr07_01.m
(Section 17.8.30: npr07_01)). The class \( \{C_j : 1 \le j \le 10\} \) is a partition. Random variable X has
values {1,3,2,3,4,2,1,3,5,2} on \( C_1 \) through \( C_{10} \), respectively, with probabilities 0.08, 0.13, 0.06,
0.09, 0.14, 0.11, 0.12, 0.07, 0.11, 0.09. Determine Var[X].

Exercise 12.2 (Solution on p. 374.) 

(See Exercise 2 (Exercise 7.2) from "Problems on Distribution and Density Functions", and
Exercise 2 (Exercise 11.2) from "Problems on Mathematical Expectation", m-file npr07_02.m
(Section 17.8.31: npr07_02)). A store has eight items for sale. The prices are $3.50, $5.00, $3.50, $7.50,
$5.00, $5.00, $3.50, and $7.50, respectively. A customer comes in. She purchases one of the items
with probabilities 0.10, 0.15, 0.15, 0.20, 0.10, 0.05, 0.10, 0.15. The random variable expressing the
amount of her purchase may be written

\[ X = 3.5I_{C_1} + 5.0I_{C_2} + 3.5I_{C_3} + 7.5I_{C_4} + 5.0I_{C_5} + 5.0I_{C_6} + 3.5I_{C_7} + 7.5I_{C_8} \tag{12.101} \]

Determine Var [X]. 

Exercise 12.3 (Solution on p. 374.) 

(See Exercise 12 (Exercise 6.12) from "Problems on Random Variables and Probabilities",
Exercise 3 (Exercise 11.3) from "Problems on Mathematical Expectation", m-file npr06_12.m
(Section 17.8.28: npr06_12)). The class {A, B, C, D} has minterm probabilities

pm = 0.001 * [5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302] (12.102)

Consider \( X = I_A + I_B + I_C + I_D \), which counts the number of these events which occur on a trial.
Determine Var[X].

Exercise 12.4 (Solution on p. 374.) 

(See Exercise 4 (Exercise 11.4) from "Problems on Mathematical Expectation"). In a thunderstorm 
in a national park there are 127 lightning strikes. Experience shows that the probability of each 
lightning strike starting a fire is about 0.0083. Determine Var [X]. 



4 This content is available online at <http://cnx.Org/content/m24379/l.4/>. 
Exercise 12.5 (Solution on p. 374.)

(See Exercise 5 (Exercise 11.5) from "Problems on Mathematical Expectation"). Two coins are 
flipped twenty times. Let X be the number of matches (both heads or both tails). Determine 
Var [X]. 

Exercise 12.6 (Solution on p. 374.) 

(See Exercise 6 (Exercise 11.6) from "Problems on Mathematical Expectation"). A residential 
College plans to raise money by selling "chances" on a board. Fifty chances are sold. A player pays 
$10 to play; he or she wins $30 with probability p = 0.2. The profit to the College is 

\[ X = 50 \cdot 10 - 30N, \ \text{where } N \text{ is the number of winners} \tag{12.103} \]

Determine Var [X]. 

Exercise 12.7 (Solution on p. 374.) 

(See Exercise 7 (Exercise 11.7) from "Problems on Mathematical Expectation"). The number of
noise pulses arriving on a power circuit in an hour is a random quantity X having Poisson (7)
distribution. Determine Var[X].

Exercise 12.8 (Solution on p. 374.) 

(See Exercise 24 (Exercise 7.24) from "Problems on Distribution and Density Functions", and 
Exercise 8 (Exercise 11.8) from "Problems on Mathematical Expectation"). The total operating 
time for the units in Exercise 24 (Exercise 7.24) from "Problems on Distribution and Density 
Functions" is a random variable T ~ gamma (20, 0.0002). Determine Var [T]. 

Exercise 12.9 (Solution on p. 374.) 

The class {A, B, C, D, E, F} is independent, with respective probabilities
0.43, 0.53, 0.46, 0.37, 0.45, 0.39. Let

\[ X = 6I_A + 13I_B - 8I_C, \qquad Y = -3I_D + 4I_E + I_F - 7 \tag{12.104} \]

a. Use properties of expectation and variance to obtain E[X], Var[X], E[Y], and Var[Y]. Note
that it is not necessary to obtain the distributions for X or Y.

b. Let Z = 3Y - 2X.
Determine E[Z] and Var[Z].

Exercise 12.10 (Solution on p. 375.) 

Consider \( X = -3.3I_A - 1.7I_B + 2.3I_C + 7.6I_D - 3.4 \). The class {A, B, C, D} has minterm
probabilities (data are in m-file npr12_10.m (Section 17.8.43: npr12_10))

pmx = [0.0475 0.0725 0.0120 0.0180 0.1125 0.1675 0.0280 0.0420 ...   (12.105)
       0.0480 0.0720 0.0130 0.0170 0.1120 0.1680 0.0270 0.0430]      (12.106)

a. Calculate E[X] and Var[X].

b. Let \( W = 2X^2 - 3X + 2 \).
Calculate E[W] and Var[W].

Exercise 12.11 (Solution on p. 375.) 

Consider a second random variable \( Y = 10I_E + 17I_F + 20I_G - 10 \) in addition to that in Exercise 12.10.
The class {E, F, G} has minterm probabilities (in m-file npr12_10.m (Section 17.8.43: npr12_10))

pmy = [0.06 0.14 0.09 0.21 0.06 0.14 0.09 0.21] (12.107)

The pair {X, Y} is independent.

a. Calculate E[Y] and Var[Y].

b. Let \( Z = X^2 + 2XY - Y \).
Calculate E[Z] and Var[Z].

Exercise 12.12 (Solution on p. 376.) 

Suppose the pair {X, Y} is independent, with X ~ gamma (3,0.1) and 
Y ~ Poisson (13). Let Z = 2X - 5Y. Determine E [Z] and Var [Z]. 

Exercise 12.13 (Solution on p. 376.) 

The pair {X, Y} is jointly distributed with the following parameters:

\[ E[X] = 3, \quad E[Y] = 4, \quad E[XY] = 15, \quad E[X^2] = 11, \quad \mathrm{Var}[Y] = 5 \tag{12.108} \]

Determine Var[3X - 2Y].

Exercise 12.14 (Solution on p. 376.) 

The class {A, B, C, D, E, F} is independent, with respective probabilities

0.47, 0.33, 0.46, 0.27, 0.41, 0.37 (12.109)

Let

\[ X = 8I_A + 11I_B - 7I_C, \quad Y = -3I_D + 5I_E + I_F - 3, \quad \text{and} \quad Z = 3Y - 2X \tag{12.110} \]

a. Use properties of expectation and variance to obtain E[X], Var[X], E[Y], and Var[Y].

b. Determine E[Z] and Var[Z].

c. Use appropriate m-programs to obtain E[X], Var[X], E[Y], Var[Y], E[Z], and Var[Z].
Compare with results of parts (a) and (b).

Exercise 12.15 (Solution on p. 377.) 

For the Beta (r, s) distribution, 

a. Determine \( E[X^n] \), where n is a positive integer.

b. Use the result of part (a) to determine E [X] and Var [X]. 

Exercise 12.16 (Solution on p. 377.) 

The pair {X, Y} has joint distribution. Suppose

\[ E[X] = 3, \quad E[X^2] = 11, \quad E[Y] = 10, \quad E[Y^2] = 101, \quad E[XY] = 30 \tag{12.111} \]

Determine Var[15X - 2Y].

Exercise 12.17 (Solution on p. 377.) 

The pair {X, Y} has joint distribution. Suppose

\[ E[X] = 2, \quad E[X^2] = 5, \quad E[Y] = 1, \quad E[Y^2] = 2, \quad E[XY] = 1 \tag{12.112} \]

Determine Var[3X + 2Y].

Exercise 12.18 (Solution on p. 378.) 

The pair {X, Y} is independent, with

\[ E[X] = 2, \quad E[Y] = 1, \quad \mathrm{Var}[X] = 6, \quad \mathrm{Var}[Y] = 4 \tag{12.113} \]

Let \( Z = 2X^2 + XY^2 - 3Y + 4 \).
Determine E[Z].

Exercise 12.19 (Solution on p. 378.) 

(See Exercise 9 (Exercise 11.9) from "Problems on Mathematical Expectation"). Random variable
X has density function

\[ f_X(t) = \begin{cases} (6/5)\,t^2 & \text{for } 0 \le t \le 1 \\ (6/5)\,(2 - t) & \text{for } 1 < t \le 2 \end{cases} \; = I_{[0,1]}(t)\,\frac{6}{5}t^2 + I_{(1,2]}(t)\,\frac{6}{5}(2 - t) \tag{12.114} \]

E[X] = 11/10. Determine Var[X].

For the distributions in Exercises 20-22

Determine Var[X], Cov[X, Y], and the regression line of Y on X.

Exercise 12.20 (Solution on p. 378.)

(See Exercise 7 (Exercise 8.7) from "Problems On Random Vectors and Joint Distributions", and
Exercise 17 (Exercise 11.17) from "Problems on Mathematical Expectation"). The pair {X, Y}
has the joint distribution (in file npr08_07.m (Section 17.8.38: npr08_07)):

P(X = t, Y = u) (12.115)

     t =     -3.1     -0.5      1.2      2.4      3.7      4.9
u =  7.5   0.0090   0.0396   0.0594   0.0216   0.0440   0.0203
     4.1   0.0495   0        0.1089   0.0528   0.0363   0.0231
    -2.0   0.0405   0.1320   0.0891   0.0324   0.0297   0.0189
    -3.8   0.0510   0.0484   0.0726   0.0132   0        0.0077

Table 12.2

Exercise 12.21 (Solution on p. 378.) 

(See Exercise 8 (Exercise 8.8) from "Problems On Random Vectors and Joint Distributions", and 
Exercise 18 (Exercise 11.18) from "Problems on Mathematical Expectation"). The pair {X, Y} 
has the joint distribution (in file npr08_08.m (Section 17.8.39: npr08_08)): 



P(X = t, Y = u) (12.116)

    t =      1        3        5        7        9       11       13       15       17       19
u = 12    0.0156   0.0191   0.0081   0.0035   0.0091   0.0070   0.0098   0.0056   0.0091   0.0049
    10    0.0064   0.0204   0.0108   0.0040   0.0054   0.0080   0.0112   0.0064   0.0104   0.0056
     9    0.0196   0.0256   0.0126   0.0060   0.0156   0.0120   0.0168   0.0096   0.0056   0.0084
     5    0.0112   0.0182   0.0108   0.0070   0.0182   0.0140   0.0196   0.0012   0.0182   0.0038
     3    0.0060   0.0260   0.0162   0.0050   0.0160   0.0200   0.0280   0.0060   0.0160   0.0040
    -1    0.0096   0.0056   0.0072   0.0060   0.0256   0.0120   0.0268   0.0096   0.0256   0.0084
    -3    0.0044   0.0134   0.0180   0.0140   0.0234   0.0180   0.0252   0.0244   0.0234   0.0126
    -5    0.0072   0.0017   0.0063   0.0045   0.0167   0.0090   0.0026   0.0172   0.0217   0.0223

Table 12.3

Exercise 12.22 (Solution on p. 379.) 

(See Exercise 9 (Exercise 8.9) from "Problems On Random Vectors and Joint Distributions", and 
Exercise 19 (Exercise 11.19) from "Problems on Mathematical Expectation"). Data were kept on 
the effect of training time on the time to perform a job on a production line. X is the amount of 
training, in hours, and Y is the time to perform the task, in minutes. The data are as follows (in 
file npr08_09.m (Section 17.8.40: npr08_09)): 



P(X = t, Y = u) (12.117)

   t =     1       1.5      2       2.5      3
u = 5    0.039    0.011    0.005    0.001    0.001
    4    0.065    0.070    0.050    0.015    0.010
    3    0.031    0.061    0.137    0.051    0.033
    2    0.012    0.049    0.163    0.058    0.039
    1    0.003    0.009    0.045    0.025    0.017

Table 12.4



For the joint densities in Exercises 23-30 below 

a. Determine analytically Var [X], Cov [X, Y], and the regression line of Y on X. 

b. Check these with a discrete approximation. 



Exercise 12.23 (Solution on p. 379.) 

(See Exercise 10 (Exercise 8.10) from "Problems On Random Vectors and Joint Distributions",
and Exercise 20 (Exercise 11.20) from "Problems on Mathematical Expectation"). \( f_{XY}(t, u) = 1 \)
for \( 0 \le t \le 1, \ 0 \le u \le 2(1 - t) \).

\[ E[X] = \frac{1}{3}, \quad E[X^2] = \frac{1}{6}, \quad E[Y] = \frac{2}{3} \tag{12.118} \]



Exercise 12.24 (Solution on p. 379.) 

(See Exercise 13 (Exercise 8.13) from "Problems On Random Vectors and Joint Distributions", and
Exercise 23 (Exercise 11.23) from "Problems on Mathematical Expectation"). \( f_{XY}(t, u) = \frac{1}{8}(t + u) \)
for \( 0 \le t \le 2, \ 0 \le u \le 2 \).

\[ E[X] = E[Y] = \frac{7}{6}, \quad E[X^2] = \frac{5}{3} \tag{12.119} \]



Exercise 12.25 (Solution on p. 380.) 

(See Exercise 15 (Exercise 8.15) from "Problems On Random Vectors and Joint Distributions",
and Exercise 25 (Exercise 11.25) from "Problems on Mathematical Expectation"). $f_{XY}(t,u) =
\frac{3}{88}(2t + 3u^2)$ for $0 \le t \le 2$, $0 \le u \le 1 + t$.

$E[X] = \frac{313}{220}, \quad E[Y] = \frac{1429}{880}, \quad E[X^2] = \frac{49}{22}$ (12.120)
Exercise 12.26 (Solution on p. 380.)

(See Exercise 16 (Exercise 8.16) from "Problems On Random Vectors and Joint Distributions", and
Exercise 26 (Exercise 11.26) from "Problems on Mathematical Expectation"). $f_{XY}(t,u) = 12t^2u$
on the parallelogram with vertices

(-1,0), (0,0), (1,1), (0,1) (12.121)

$E[X] = \frac{2}{5}, \quad E[Y] = \frac{11}{15}, \quad E[X^2] = \frac{2}{5}$ (12.122)

Exercise 12.27 (Solution on p. 380.) 

(See Exercise 17 (Exercise 8.17) from "Problems On Random Vectors and Joint Distributions", and
Exercise 27 (Exercise 11.27) from "Problems on Mathematical Expectation"). $f_{XY}(t,u) = \frac{24}{11}tu$
for $0 \le t \le 2$, $0 \le u \le \min\{1, 2 - t\}$.

$E[X] = \frac{52}{55}, \quad E[Y] = \frac{32}{55}, \quad E[X^2] = \frac{57}{55}$ (12.123)

Exercise 12.28 (Solution on p. 380.) 

(See Exercise 18 (Exercise 8.18) from "Problems On Random Vectors and Joint Distributions",
and Exercise 28 (Exercise 11.28) from "Problems on Mathematical Expectation"). $f_{XY}(t,u) =
\frac{3}{23}(t + 2u)$ for $0 \le t \le 2$, $0 \le u \le \max\{2 - t, t\}$.

$E[X] = \frac{53}{46}, \quad E[Y] = \frac{22}{23}, \quad E[X^2] = \frac{9131}{5290}$ (12.124)

Exercise 12.29 (Solution on p. 381.) 

(See Exercise 21 (Exercise 8.21) from "Problems On Random Vectors and Joint Distributions",
and Exercise 31 (Exercise 11.31) from "Problems on Mathematical Expectation"). $f_{XY}(t,u) =
\frac{2}{13}(t + 2u)$, for $0 \le t \le 2$, $0 \le u \le \min\{2t, 3 - t\}$.

$E[X] = \frac{16}{13}, \quad E[Y] = \frac{11}{12}, \quad E[X^2] = \frac{2847}{1690}$ (12.125)

Exercise 12.30 (Solution on p. 381.) 

(See Exercise 22 (Exercise 8.22) from "Problems On Random Vectors and Joint Distributions",
and Exercise 32 (Exercise 11.32) from "Problems on Mathematical Expectation"). $f_{XY}(t,u) =
I_{[0,1]}(t)\,\frac{3}{8}(t^2 + 2u) + I_{(1,2]}(t)\,\frac{9}{14}t^2u^2$, for $0 \le u \le 1$.

$E[X] = \frac{243}{224}, \quad E[Y] = \frac{11}{16}, \quad E[X^2] = \frac{107}{70}$ (12.126)

Exercise 12.31 (Solution on p. 381.) 

The class {X, Y, Z} of random variables is iid (independent, identically distributed) with common 
distribution 

X = [-5 -1 3 4 7]   PX = 0.01 * [15 20 30 25 10] (12.127)

Let W = 3X - 4Y + 2Z. Determine E [W] and Var [W]. Do this using icalc, then repeat with
icalc3 and compare results.
Exercise 12.32 (Solution on p. 382.)

$f_{XY}(t,u) = \frac{3}{88}(2t + 3u^2)$ for $0 \le t \le 2$, $0 \le u \le 1 + t$ (see Exercise 25 (Exercise 11.25) and
Exercise 37 (Exercise 11.37) from "Problems on Mathematical Expectation").

$Z = I_{[0,1]}(X)\,4X + I_{(1,2]}(X)\,(X + Y)$ (12.128)

$E[X] = \frac{313}{220}, \quad E[Z] = \frac{5649}{1760}, \quad E[Z^2] = \frac{4881}{440}$ (12.129)

Determine Var [Z] and Cov [X,Z]. Check with discrete approximation.

Exercise 12.33 (Solution on p. 382.) 

$f_{XY}(t,u) = \frac{24}{11}tu$ for $0 \le t \le 2$, $0 \le u \le \min\{1, 2 - t\}$ (see Exercise 27 (Exercise 11.27) and
Exercise 38 (Exercise 11.38) from "Problems on Mathematical Expectation").

$Z = I_M(X,Y)\,\frac{1}{2}X + I_{M^c}(X,Y)\,Y^2, \quad M = \{(t,u) : u > t\}$ (12.130)

$E[X] = \frac{52}{55}, \quad E[Z] = \frac{16}{55}, \quad E[Z^2] = \frac{39}{308}$ (12.131)

Determine Var [Z] and Cov [X,Z]. Check with discrete approximation.

Exercise 12.34 (Solution on p. 383.) 

$f_{XY}(t,u) = \frac{3}{23}(t + 2u)$ for $0 \le t \le 2$, $0 \le u \le \max\{2 - t, t\}$ (see Exercise 28 (Exercise 11.28) and
Exercise 39 (Exercise 11.39) from "Problems on Mathematical Expectation").

$Z = I_M(X,Y)\,(X + Y) + I_{M^c}(X,Y)\,2Y, \quad M = \{(t,u) : \max(t,u) \le 1\}$ (12.132)

$E[X] = \frac{53}{46}, \quad E[Z] = \frac{175}{92}, \quad E[Z^2] = \frac{2063}{460}$ (12.133)

Determine Var [Z] and Cov [X,Z]. Check with discrete approximation.

Exercise 12.35 (Solution on p. 383.) 

$f_{XY}(t,u) = \frac{12}{179}(3t^2 + u)$, for $0 \le t \le 2$, $0 \le u \le \min\{2, 3 - t\}$ (see Exercise 29 (Exercise 11.29)
and Exercise 40 (Exercise 11.40) from "Problems on Mathematical Expectation").

$Z = I_M(X,Y)\,(X + Y) + I_{M^c}(X,Y)\,2Y^2, \quad M = \{(t,u) : t \le 1,\ u \ge 1\}$ (12.134)

$E[X] = \frac{2313}{1790}, \quad E[Z] = \frac{1422}{895}, \quad E[Z^2] = \frac{28296}{6265}$ (12.135)

Determine Var [Z] and Cov [X,Z]. Check with discrete approximation.

Exercise 12.36 (Solution on p. 383.) 

$f_{XY}(t,u) = \frac{12}{227}(3t + 2tu)$, for $0 \le t \le 2$, $0 \le u \le \min\{1 + t, 2\}$ (see Exercise 30 (Exercise 11.30)
and Exercise 41 (Exercise 11.41) from "Problems on Mathematical Expectation").

$Z = I_M(X,Y)\,X + I_{M^c}(X,Y)\,XY, \quad M = \{(t,u) : u \le \min(1, 2 - t)\}$ (12.136)

$E[X] = \frac{1567}{1135}, \quad E[Z] = \frac{5774}{3405}, \quad E[Z^2] = \frac{56673}{15890}$ (12.137)

Determine Var [Z] and Cov [X,Z]. Check with discrete approximation.
Exercise 12.37 (Solution on p. 384.)

(See Exercise 12.20, and Exercises 9 (Exercise 10.9) and 10 (Exercise 10.10) from "Problems on
Functions of Random Variables"). For the pair {X, Y} in Exercise 12.20, let

$Z = g(X,Y) = 3X^2 + 2XY - Y^2$ (12.138)

$W = h(X,Y) = \begin{cases} X & \text{for } X + Y \le 4 \\ 2Y & \text{for } X + Y > 4 \end{cases} = I_M(X,Y)\,X + I_{M^c}(X,Y)\,2Y$ (12.139)

Determine the joint distribution for the pair {Z, W} and determine the regression line of W on Z.
Solutions to Exercises in Chapter 12

Solution to Exercise 12.1 (p. 366) 

npr07_01 (Section 17.8.30: npr07_01)
Data are in T and pc
EX = T*pc'
EX = 2.7000
VX = (T.^2)*pc' - EX^2
VX = 1.5500

[X,PX] = csort(T,pc); % Alternate

Ex = X*PX'
Ex = 2.7000
Vx = (X.^2)*PX' - EX^2
Vx = 1.5500

Solution to Exercise 12.2 (p. 366) 

npr07_02 (Section 17.8.31: npr07_02)
Data are in T, pc
EX = T*pc';

VX = (T.^2)*pc' - EX^2
VX = 2.8525

Solution to Exercise 12.3 (p. 366) 

npr06_12 (Section 17.8.28: npr06_12)
Minterm probabilities in pm, coefficients in c
canonic

Enter row vector of coefficients c

Enter row vector of minterm probabilities pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
VX = (X.^2)*PX' - (X*PX')^2
VX = 0.7309

Solution to Exercise 12.4 (p. 366) 

$X \sim$ binomial (127, 0.0083). $\mathrm{Var}[X] = 127 \cdot 0.0083 \cdot (1 - 0.0083) = 1.0454$.

Solution to Exercise 12.5 (p. 367)

$X \sim$ binomial (20, 1/2). $\mathrm{Var}[X] = 20 \cdot (1/2)^2 = 5$.

Solution to Exercise 12.6 (p. 367)

$N \sim$ binomial (50, 0.2). $\mathrm{Var}[N] = 50 \cdot 0.2 \cdot 0.8 = 8$. $\mathrm{Var}[X] = 30^2\,\mathrm{Var}[N] = 7200$.

Solution to Exercise 12.7 (p. 367)

$X \sim$ Poisson (7). $\mathrm{Var}[X] = \mu = 7$.

Solution to Exercise 12.8 (p. 367)

$T \sim$ gamma (20, 0.0002). $\mathrm{Var}[T] = 20/0.0002^2 = 500{,}000{,}000$.
Solution to Exercise 12.9 (p. 367) 

cx = [6 13 -8 0];
cy = [-3 4 1 -7];
px = 0.01*[43 53 46 100];
py = 0.01*[37 45 39 100];
EX = dot(cx,px)
EX = 5.7900
EY = dot(cy,py)
EY = -5.9200
VX = sum(cx.^2.*px.*(1-px))
VX = 66.8191
VY = sum(cy.^2.*py.*(1-py))
VY = 6.2958
EZ = 3*EY - 2*EX
EZ = -29.3400
VZ = 9*VY + 4*VX
VZ = 323.9386

Solution to Exercise 12.10 (p. 367) 

npr12_10 (Section 17.8.43: npr12_10)
Data are in cx, cy, pmx and pmy
canonic
Enter row vector of coefficients cx
Enter row vector of minterm probabilities pmx
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
EX = dot(X,PX)
EX = -1.2200
VX = dot(X.^2,PX) - EX^2
VX = 18.0253
G = 2*X.^2 - 3*X + 2;
[W,PW] = csort(G,PX);
EW = dot(W,PW)
EW = 44.6874
VW = dot(W.^2,PW) - EW^2
VW = 2.8659e+03

Solution to Exercise 12.11 (p. 367) 

(Continuation of Exercise 12.10) 

[Y,PY] = canonicf(cy,pmy);
EY = dot(Y,PY)
EY = 19.2000
VY = dot(Y.^2,PY) - EY^2
VY = 178.3600
icalc
Enter row matrix of X-values X
Enter row matrix of Y-values Y
Enter X probabilities PX
Enter Y probabilities PY
Use array operations on matrices X, Y, PX, PY, t, u, and P
H = t.^2 + 2*t.*u - u;
[Z,PZ] = csort(H,P);
EZ = dot(Z,PZ)
EZ = -46.5343
VZ = dot(Z.^2,PZ) - EZ^2
VZ = 3.7165e+04

Solution to Exercise 12.12 (p. 368) 

$X \sim$ gamma (3, 0.1) implies $E[X] = 30$ and $\mathrm{Var}[X] = 300$. $Y \sim$ Poisson (13) implies $E[Y] = \mathrm{Var}[Y] = 13$.
Then

$E[Z] = 2 \cdot 30 - 5 \cdot 13 = -5, \quad \mathrm{Var}[Z] = 4 \cdot 300 + 25 \cdot 13 = 1525$ (12.140)

Solution to Exercise 12.13 (p. 368) 

EX = 3;
EY = 4;
EXY = 15;
EX2 = 11;
VY = 5;

VX = EX2 - EX^2
VX = 2

CV = EXY - EX*EY
CV = 3

VZ = 9*VX + 4*VY - 6*2*CV
VZ = 2

Solution to Exercise 12.14 (p. 368) 

px = 0.01*[47 33 46 100];
py = 0.01*[27 41 37 100];
cx = [8 11 -7 0];
cy = [-3 5 1 -3];
ex = dot(cx,px)
ex = 4.1700
ey = dot(cy,py)
ey = -1.3900

vx = sum(cx.^2.*px.*(1 - px))
vx = 54.8671

vy = sum(cy.^2.*py.*(1 - py))
vy = 8.0545

[X,PX] = canonicf(cx,minprob(px(1:3)));
[Y,PY] = canonicf(cy,minprob(py(1:3)));
icalc

Enter row matrix of X-values X
Enter row matrix of Y-values Y
Enter X probabilities PX
Enter Y probabilities PY

Use array operations on matrices X, Y, PX, PY, t, u, and P
EX = dot(X,PX)
EX = 4.1700
EY = dot(Y,PY)
EY = -1.3900
VX = dot(X.^2,PX) - EX^2
VX = 54.8671
VY = dot(Y.^2,PY) - EY^2
VY = 8.0545
EZ = 3*EY - 2*EX
EZ = -12.5100
VZ = 9*VY + 4*VX
VZ = 291.9589

Solution to Exercise 12.15 (p. 368) 

$E[X^n] = \frac{\Gamma(r+s)}{\Gamma(r)\Gamma(s)} \int_0^1 t^{r+n-1}(1-t)^{s-1}\,dt = \frac{\Gamma(r+s)}{\Gamma(r)\Gamma(s)} \cdot \frac{\Gamma(r+n)\Gamma(s)}{\Gamma(r+s+n)}$ (12.141)

$= \frac{\Gamma(r+n)\,\Gamma(r+s)}{\Gamma(r+s+n)\,\Gamma(r)}$ (12.142)

Using $\Gamma(x+1) = x\Gamma(x)$ we have

$E[X] = \frac{r}{r+s}, \quad E[X^2] = \frac{r(r+1)}{(r+s)(r+s+1)}$ (12.143)

Some algebraic manipulations show that

$\mathrm{Var}[X] = E[X^2] - E^2[X] = \frac{rs}{(r+s)^2(r+s+1)}$ (12.144)



Solution to Exercise 12.16 (p. 368) 

EX = 3;
EX2 = 11;
EY = 10;
EY2 = 101;
EXY = 30;
VX = EX2 - EX^2
VX = 2

VY = EY2 - EY^2
VY = 1

CV = EXY - EX*EY
CV = 0

VZ = 15^2*VX + 2^2*VY
VZ = 454

Solution to Exercise 12.17 (p. 368) 

EX = 2;
EX2 = 5;
EY = 1;
EY2 = 2;
EXY = 1;
VX = EX2 - EX^2
VX = 1

VY = EY2 - EY^2
VY = 1

CV = EXY - EX*EY
CV = -1
VZ = 9*VX + 4*VY + 2*6*CV
VZ = 1

Solution to Exercise 12.18 (p. 368) 

EX = 2; 
EY = 1; 
VX = 6; 
VY = 4; 

EX2 = VX + EX^2
EX2 = 10
EY2 = VY + EY^2
EY2 = 5

EZ = 2*EX2 + EX*EY2 - 3*EY + 4
EZ = 31

Solution to Exercise 12.19 (p. 369) 

$E[X^2] = \int t^2 f_X(t)\,dt = \frac{6}{5}\int_0^1 t^4\,dt + \frac{6}{5}\int_1^2 (2t^2 - t^3)\,dt = \frac{67}{50}$ (12.145)

$\mathrm{Var}[X] = E[X^2] - E^2[X] = \frac{13}{100}$ (12.146)

Solution to Exercise 12.20 (p. 369) 

npr08_07 (Section 17.8.38: npr08_07)
Data are in X, Y, P
jcalc

EX = dot(X,PX);
EY = dot(Y,PY);
VX = dot(X.^2,PX) - EX^2
VX = 5.1116
CV = total(t.*u.*P) - EX*EY
CV = 2.6963
a = CV/VX
a = 0.5275
b = EY - a*EX
b = 0.6924 % Regression line: u = at + b

Solution to Exercise 12.21 (p. 369) 



npr08_08 (Section 17.8.39: npr08_08)
Data are in X, Y, P
jcalc

EX = dot(X,PX);
EY = dot(Y,PY);
VX = dot(X.^2,PX) - EX^2
VX = 31.0700
CV = total(t.*u.*P) - EX*EY
CV = -8.0272
a = CV/VX
a = -0.2584
b = EY - a*EX
b = 5.6110 % Regression line: u = at + b

Solution to Exercise 12.22 (p. 370) 



npr08_09 (Section 17.8.40: npr08_09)
Data are in X, Y, P
jcalc

EX = dot(X,PX);
EY = dot(Y,PY);
VX = dot(X.^2,PX) - EX^2
VX = 0.3319
CV = total(t.*u.*P) - EX*EY
CV = -0.2586
a = CV/VX
a = -0.7793
b = EY - a*EX
b = 4.3051 % Regression line: u = at + b
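
As a cross-check that does not depend on jcalc, the regression line for Exercise 12.22 can be recomputed directly from Table 12.4 in basic MATLAB. The sketch below is only an illustration: it assumes P is laid out as in the table (rows for u = 5, 4, 3, 2, 1 and columns for t = 1, ..., 3), and it builds the marginals and the t, u arrays by hand under names chosen to mirror the session above.

```matlab
% Joint distribution from Table 12.4 (assumed layout: rows u = 5..1, columns t = 1..3)
X = [1 1.5 2 2.5 3];
Y = [5 4 3 2 1];
P = [0.039 0.011 0.005 0.001 0.001;
     0.065 0.070 0.050 0.015 0.010;
     0.031 0.061 0.137 0.051 0.033;
     0.012 0.049 0.163 0.058 0.039;
     0.003 0.009 0.045 0.025 0.017];
[t, u] = meshgrid(X, Y);         % t(i,j) = X(j), u(i,j) = Y(i), matching P(i,j)
PX = sum(P, 1);                  % marginal for X (column sums)
PY = sum(P, 2)';                 % marginal for Y (row sums)
EX = dot(X, PX);
EY = dot(Y, PY);
VX = dot(X.^2, PX) - EX^2;       % Var[X]
CV = sum(sum(t.*u.*P)) - EX*EY;  % Cov[X,Y]
a  = CV/VX;                      % slope of the regression line of Y on X
b  = EY - a*EX;                  % intercept
fprintf('VX = %.4f  CV = %.4f  a = %.4f  b = %.4f\n', VX, CV, a, b);
```

The printed values can be compared with VX, CV, a and b in the session above.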


Solution to Exercise 12.23 (p. 370) 








$E[XY] = \int_0^1 \int_0^{2(1-t)} tu\,du\,dt = 1/6$ (12.147)

$\mathrm{Cov}[X,Y] = \frac{1}{6} - \frac{1}{3} \cdot \frac{2}{3} = -1/18, \quad \mathrm{Var}[X] = 1/6 - (1/3)^2 = 1/18$ (12.148)

$a = \mathrm{Cov}[X,Y]/\mathrm{Var}[X] = -1, \quad b = E[Y] - aE[X] = 1$ (12.149)

tuappr: [0 1] [0 2] 200 400 u<=2*(1-t)
EX = dot(X,PX);
EY = dot(Y,PY);
VX = dot(X.^2,PX) - EX^2
VX = 0.0556

CV = total(t.*u.*P) - EX*EY
CV = -0.0556
a = CV/VX
a = -1.0000
b = EY - a*EX
b = 1.0000
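
The discrete approximation used in these solutions can be imitated in basic MATLAB. The following is a minimal sketch for Exercise 12.23 under stated assumptions: it places mass (density times cell area) at the midpoints of a 200 by 400 grid on [0,1] x [0,2] and renormalizes. Whether the m-procedure tuappr uses exactly this construction is not confirmed here; the variable names simply mirror the session above.

```matlab
nt = 200;  nu = 400;                   % grid sizes, as in the tuappr call above
dt = 1/nt; du = 2/nu;                  % cell widths on [0,1] and [0,2]
X = ((1:nt) - 0.5)*dt;                 % midpoints of the t-subintervals
Y = ((1:nu) - 0.5)*du;                 % midpoints of the u-subintervals
[t, u] = meshgrid(X, Y);               % t(i,j) = X(j), u(i,j) = Y(i)
P = double(u <= 2*(1 - t))*dt*du;      % density (1 on the triangle) times cell area
P = P/sum(P(:));                       % renormalize the approximate masses
PX = sum(P, 1);                        % approximate marginal for X
PY = sum(P, 2)';                       % approximate marginal for Y
EX = dot(X, PX);
EY = dot(Y, PY);
VX = dot(X.^2, PX) - EX^2;             % compare with 1/18
CV = sum(sum(t.*u.*P)) - EX*EY;        % compare with -1/18
a  = CV/VX;                            % compare with -1
b  = EY - a*EX;                        % compare with 1
fprintf('VX = %.4f  CV = %.4f  a = %.4f  b = %.4f\n', VX, CV, a, b);
```

Replacing the indicator expression with another density (times its region indicator) and adjusting the ranges gives the same kind of check for the other exercises in this group.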



Solution to Exercise 12.24 (p. 370) 

$E[XY] = \frac{1}{8}\int_0^2 \int_0^2 tu(t + u)\,du\,dt = 4/3, \quad \mathrm{Cov}[X,Y] = -1/36, \quad \mathrm{Var}[X] = 11/36$ (12.150)

$a = \mathrm{Cov}[X,Y]/\mathrm{Var}[X] = -1/11, \quad b = E[Y] - aE[X] = 14/11$ (12.151)

tuappr: [0 2] [0 2] 200 200 (1/8)*(t+u)
VX = 0.3055 CV = -0.0278 a = -0.0909 b = 1.2727

Solution to Exercise 12.25 (p. 370) 

$E[XY] = \frac{3}{88}\int_0^2 \int_0^{1+t} tu(2t + 3u^2)\,du\,dt = \frac{2153}{880}, \quad \mathrm{Cov}[X,Y] = \frac{26383}{193600}, \quad \mathrm{Var}[X] = \frac{9831}{48400}$ (12.152)

$a = \mathrm{Cov}[X,Y]/\mathrm{Var}[X] = \frac{26383}{39324}, \quad b = E[Y] - aE[X] = \frac{26321}{39324}$ (12.153)

tuappr: [0 2] [0 3] 200 300 (3/88)*(2*t + 3*u.^2).*(u<=1+t)
VX = 0.2036 CV = 0.1364 a = 0.6700 b = 0.6736

Solution to Exercise 12.26 (p. 371) 

$E[XY] = 12\int_{-1}^0 \int_0^{t+1} t^3u^2\,du\,dt + 12\int_0^1 \int_t^1 t^3u^2\,du\,dt = \frac{2}{5}$ (12.154)

$\mathrm{Cov}[X,Y] = \frac{8}{75}, \quad \mathrm{Var}[X] = \frac{6}{25}$ (12.155)

$a = \mathrm{Cov}[X,Y]/\mathrm{Var}[X] = 4/9, \quad b = E[Y] - aE[X] = 5/9$ (12.156)

tuappr: [-1 1] [0 1] 400 200 12*t.^2.*u.*(u>=max(0,t)).*(u<=min(1+t,1))
VX = 0.2383 CV = 0.1056 a = 0.4432 b = 0.5553

Solution to Exercise 12.27 (p. 371) 

$E[XY] = \frac{24}{11}\int_0^1 \int_0^1 t^2u^2\,du\,dt + \frac{24}{11}\int_1^2 \int_0^{2-t} t^2u^2\,du\,dt = \frac{28}{55}$ (12.157)

$\mathrm{Cov}[X,Y] = -\frac{124}{3025}, \quad \mathrm{Var}[X] = \frac{431}{3025}$ (12.158)

$a = \mathrm{Cov}[X,Y]/\mathrm{Var}[X] = -\frac{124}{431}, \quad b = E[Y] - aE[X] = \frac{368}{431}$ (12.159)

tuappr: [0 2] [0 1] 400 200 (24/11)*t.*u.*(u<=min(1,2-t))
VX = 0.1425 CV = -0.0409 a = -0.2867 b = 0.8535

Solution to Exercise 12.28 (p. 371) 

$E[XY] = \frac{3}{23}\int_0^1 \int_0^{2-t} tu(t + 2u)\,du\,dt + \frac{3}{23}\int_1^2 \int_0^t tu(t + 2u)\,du\,dt = \frac{251}{230}$ (12.160)

$\mathrm{Cov}[X,Y] = -\frac{57}{5290}, \quad \mathrm{Var}[X] = \frac{4217}{10580}$ (12.161)

$a = \mathrm{Cov}[X,Y]/\mathrm{Var}[X] = -\frac{114}{4217}, \quad b = E[Y] - aE[X] = \frac{4165}{4217}$ (12.162)

tuappr: [0 2] [0 2] 200 200 (3/23)*(t + 2*u).*(u<=max(2-t,t))
VX = 0.3984 CV = -0.0108 a = -0.0272 b = 0.9909

Solution to Exercise 12.29 (p. 371) 

$E[XY] = \frac{2}{13}\int_0^1 \int_0^{2t} tu(t + 2u)\,du\,dt + \frac{2}{13}\int_1^2 \int_0^{3-t} tu(t + 2u)\,du\,dt = \frac{431}{390}$ (12.163)

$\mathrm{Cov}[X,Y] = -\frac{3}{130}, \quad \mathrm{Var}[X] = \frac{287}{1690}$ (12.164)

$a = \mathrm{Cov}[X,Y]/\mathrm{Var}[X] = -\frac{39}{287}, \quad b = E[Y] - aE[X] = \frac{3733}{3444}$ (12.165)

tuappr: [0 2] [0 2] 400 400 (2/13)*(t + 2*u).*(u<=min(2*t,3-t))
VX = 0.1698 CV = -0.0229 a = -0.1350 b = 1.0839

Solution to Exercise 12.30 (p. 371) 

$E[XY] = \frac{3}{8}\int_0^1 \int_0^1 tu(t^2 + 2u)\,du\,dt + \frac{9}{14}\int_1^2 \int_0^1 t^3u^3\,du\,dt = \frac{347}{448}$ (12.166)

$\mathrm{Cov}[X,Y] = \frac{103}{3584}, \quad \mathrm{Var}[X] = \frac{88243}{250880}$ (12.167)

$a = \mathrm{Cov}[X,Y]/\mathrm{Var}[X] = \frac{7210}{88243}, \quad b = E[Y] - aE[X] = \frac{105691}{176486}$ (12.168)

tuappr: [0 2] [0 1] 400 200 (3/8)*(t.^2 + 2*u).*(t<=1) + (9/14)*t.^2.*u.^2.*(t>1)
VX = 0.3517 CV = 0.0287 a = 0.0817 b = 0.5989

Solution to Exercise 12.31 (p. 371) 

x = [-5 -1 3 4 7];
px = 0.01*[15 20 30 25 10];

EX = dot(x,px) % Use of properties

EX = 1.6500
VX = dot(x.^2,px) - EX^2
VX = 12.8275
EW = (3 - 4 + 2)*EX
EW = 1.6500
VW = (3^2 + 4^2 + 2^2)*VX
VW = 371.9975

icalc % Iterated use of icalc

Enter row matrix of X-values x
Enter row matrix of Y-values x
Enter X probabilities px
Enter Y probabilities px

Use array operations on matrices X, Y, PX, PY, t, u, and P
G = 3*t - 4*u;
[R,PR] = csort(G,P);
icalc
Enter row matrix of X-values R
Enter row matrix of Y-values x
Enter X probabilities PR
Enter Y probabilities px

Use array operations on matrices X, Y, PX, PY, t, u, and P
H = t + 2*u;
[W,PW] = csort(H,P);
EW = dot(W,PW)
EW = 1.6500
VW = dot(W.^2,PW) - EW^2
VW = 371.9975

icalc3 % Use of icalc3

Enter row matrix of X-values x
Enter row matrix of Y-values x
Enter row matrix of Z-values x
Enter X probabilities px
Enter Y probabilities px
Enter Z probabilities px
Use array operations on matrices X, Y, Z,
PX, PY, PZ, t, u, v, and P
S = 3*t - 4*u + 2*v;
[w,pw] = csort(S,P);
Ew = dot(w,pw)
Ew = 1.6500
Vw = dot(w.^2,pw) - Ew^2
Vw = 371.9975
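
For readers without icalc3 and csort, the same distribution can be obtained by brute-force enumeration in basic MATLAB. This is a sketch, not the book's procedure: it uses ndgrid to list all value triples, multiplies probabilities (independence), and consolidates equal values with unique and accumarray in place of csort. The intermediate names (vx, vy, vz, wvals, pvals) are illustrative only.

```matlab
x  = [-5 -1 3 4 7];
px = 0.01*[15 20 30 25 10];
[vx, vy, vz] = ndgrid(x, x, x);          % all value triples for (X, Y, Z)
[qx, qy, qz] = ndgrid(px, px, px);       % matching marginal probabilities
wvals = 3*vx - 4*vy + 2*vz;              % value of W at each triple
pvals = qx.*qy.*qz;                      % independence: joint probability is the product
[W, ~, idx] = unique(wvals(:));          % distinct values of W in increasing order
PW = accumarray(idx(:), pvals(:))';      % add probabilities of equal values
W  = W';
EW = dot(W, PW);
VW = dot(W.^2, PW) - EW^2;
fprintf('EW = %.4f  VW = %.4f\n', EW, VW);
```

The results can be compared with the values obtained above from the properties of expectation and variance and from icalc3.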

Solution to Exercise 12.32 (p. 372) 

$E[XZ] = \frac{3}{88}\int_0^1 \int_0^{1+t} 4t^2(2t + 3u^2)\,du\,dt + \frac{3}{88}\int_1^2 \int_0^{1+t} t(t + u)(2t + 3u^2)\,du\,dt = \frac{16931}{3520}$ (12.169)

$\mathrm{Var}[Z] = E[Z^2] - E^2[Z] = \frac{2451039}{3097600}, \quad \mathrm{Cov}[X,Z] = E[XZ] - E[X]E[Z] = \frac{94273}{387200}$ (12.170)

tuappr: [0 2] [0 3] 200 300 (3/88)*(2*t+3*u.^2).*(u<=1+t)
G = 4*t.*(t<=1) + (t+u).*(t>1);
EZ = total(G.*P)
EZ = 3.2110
EX = dot(X,PX)
EX = 1.4220

CV = total(G.*t.*P) - EX*EZ

CV = 0.2445 % Theoretical 0.2435

VZ = total(G.^2.*P) - EZ^2
VZ = 0.7934 % Theoretical 0.7913

Solution to Exercise 12.33 (p. 372) 

$E[XZ] = \frac{24}{11}\int_0^1 \int_t^1 t\,(t/2)\,tu\,du\,dt + \frac{24}{11}\int_0^1 \int_0^t t\,u^2\,tu\,du\,dt + \frac{24}{11}\int_1^2 \int_0^{2-t} t\,u^2\,tu\,du\,dt = \frac{211}{770}$ (12.171)

$\mathrm{Var}[Z] = E[Z^2] - E^2[Z] = \frac{3557}{84700}, \quad \mathrm{Cov}[Z,X] = E[XZ] - E[X]E[Z] = -\frac{43}{42350}$ (12.172)

tuappr: [0 2] [0 1] 400 200 (24/11)*t.*u.*(u<=min(1,2-t))
G = (t/2).*(u>t) + u.^2.*(u<=t);
EZ = total(G.*P);
VZ = total(G.^2.*P) - EZ^2
VZ = 0.0425

CV = total(t.*G.*P) - EZ*dot(X,PX)
CV = -9.2940e-04

Solution to Exercise 12.34 (p. 372) 

$E[ZX] = \frac{3}{23}\int_0^1 \int_0^1 t(t + u)(t + 2u)\,du\,dt + \frac{3}{23}\int_0^1 \int_1^{2-t} 2tu(t + 2u)\,du\,dt\ +$ (12.173)

$\frac{3}{23}\int_1^2 \int_0^t 2tu(t + 2u)\,du\,dt = \frac{1009}{460}$ (12.174)

$\mathrm{Var}[Z] = E[Z^2] - E^2[Z] = \frac{36671}{42320}, \quad \mathrm{Cov}[Z,X] = E[ZX] - E[Z]E[X] = \frac{39}{21160}$ (12.175)

tuappr: [0 2] [0 2] 400 400 (3/23)*(t+2*u).*(u<=max(2-t,t))
M = max(t,u)<=1;
G = (t+u).*M + 2*u.*(1-M);
EZ = total(G.*P);
EX = dot(X,PX);
CV = total(t.*G.*P) - EX*EZ
CV = 0.0017

Solution to Exercise 12.35 (p. 372) 

$E[ZX] = \frac{12}{179}\int_0^1 \int_1^2 t(t + u)(3t^2 + u)\,du\,dt + \frac{12}{179}\int_0^1 \int_0^1 2tu^2(3t^2 + u)\,du\,dt\ +$ (12.176)

$\frac{12}{179}\int_1^2 \int_0^{3-t} 2tu^2(3t^2 + u)\,du\,dt = \frac{24029}{12530}$ (12.177)

$\mathrm{Var}[Z] = E[Z^2] - E^2[Z] = \frac{11170332}{5607175}, \quad \mathrm{Cov}[Z,X] = E[ZX] - E[Z]E[X] = -\frac{1517647}{11214350}$ (12.178)

tuappr: [0 2] [0 2] 400 400 (12/179)*(3*t.^2 + u).*(u <= min(2,3-t))
M = (t<=1)&(u>=1);
G = (t + u).*M + 2*u.^2.*(1 - M);
EZ = total(G.*P);
EX = dot(X,PX);
CV = total(t.*G.*P) - EZ*EX
CV = -0.1347

Solution to Exercise 12.36 (p. 372) 

$E[ZX] = \frac{12}{227}\int_0^1 \int_0^1 t^2(3t + 2tu)\,du\,dt + \frac{12}{227}\int_1^2 \int_0^{2-t} t^2(3t + 2tu)\,du\,dt\ +$ (12.179)

$\frac{12}{227}\int_0^1 \int_1^{1+t} t^2u(3t + 2tu)\,du\,dt + \frac{12}{227}\int_1^2 \int_{2-t}^2 t^2u(3t + 2tu)\,du\,dt = \frac{20338}{7945}$ (12.180)

$\mathrm{Var}[Z] = E[Z^2] - E^2[Z] = \frac{112167631}{162316350}, \quad \mathrm{Cov}[Z,X] = E[ZX] - E[Z]E[X] = \frac{5915884}{27052725}$ (12.181)

tuappr: [0 2] [0 2] 400 400 (12/227)*(3*t + 2*t.*u).*(u <= min(1+t,2))
EX = dot(X,PX);
M = u <= min(1,2-t);
G = t.*M + t.*u.*(1 - M);
EZ = total(G.*P);
EZX = total(t.*G.*P)
EZX = 2.5597
CV = EZX - EX*EZ
CV = 0.2188

VZ = total(G.^2.*P) - EZ^2
VZ = 0.6907

Solution to Exercise 12.37 (p. 373) 

npr08_07 (Section 17.8.38: npr08_07)
Data are in X, Y, P
jointzw

Enter joint prob for (X,Y) P
Enter values for X X
Enter values for Y Y

Enter expression for g(t,u) 3*t.^2 + 2*t.*u - u.^2
Enter expression for h(t,u) t.*(t+u<=4) + 2*u.*(t+u>4)
Use array operations on Z, W, PZ, PW, v, w, PZW
EZ = dot(Z,PZ)
EZ = 5.2975
EW = dot(W,PW)
EW = 4.7379
VZ = dot(Z.^2,PZ) - EZ^2
VZ = 1.0588e+03
CZW = total(v.*w.*PZW) - EZ*EW
CZW = -12.1697
a = CZW/VZ
a = -0.0115
b = EW - a*EZ
b = 4.7988 % Regression line: w = av + b



Chapter 13 

Transform Methods 

13.1 Transform Methods 1 

As pointed out in the units on Expectation (Section 11.1) and Variance (Section 12.1), the mathematical
expectation $E[X] = \mu_X$ of a random variable X locates the center of mass for the induced distribution, and
the expectation

$E[g(X)] = E[(X - E[X])^2] = \mathrm{Var}[X] = \sigma_X^2$ (13.1)

measures the spread of the distribution about its center of mass. These quantities are also known,
respectively, as the mean (moment) of X and the second moment of X about the mean. Other moments give added
information. For example, the third moment about the mean $E[(X - \mu_X)^3]$ gives information about the
skew, or asymmetry, of the distribution about the mean. We investigate further along these lines by
examining the expectation of certain functions of X. Each of these functions involves a parameter, in a manner
that completely determines the distribution. For reasons noted below, we refer to these as transforms. We
consider three of the most useful of these.

13.1.1 Three basic transforms 

We define each of three transforms, determine some key properties, and use them to study various proba- 
bility distributions associated with random variables. In the section on integral transforms (Section 13.1.2: 
Integral transforms), we show their relationship to well known integral transforms. These have been studied 
extensively and used in many other applications, which makes it possible to utilize the considerable literature 
on these transforms. 

Definition. The moment generating function $M_X$ for random variable X (i.e., for its distribution) is the
function

$M_X(s) = E[e^{sX}]$ (s is a real or complex parameter) (13.2)

The characteristic function $\phi_X$ for random variable X is

$\phi_X(u) = E[e^{iuX}]$ ($i^2 = -1$, u is a real parameter) (13.3)

The generating function $g_X(s)$ for a nonnegative, integer-valued random variable X is

$g_X(s) = E[s^X] = \sum_k s^k P(X = k)$ (13.4)



1 This content is available online at <http://cnx.Org/content/m23473/l.7/>. 
The generating function $E[s^X]$ has meaning for more general random variables, but its usefulness is greatest
for nonnegative, integer-valued variables, and we limit our consideration to that case.

The defining expressions display similarities which show useful relationships. We note two which are
particularly useful.

$M_X(s) = E[e^{sX}] = E[(e^s)^X] = g_X(e^s)$ and $\phi_X(u) = E[e^{iuX}] = M_X(iu)$ (13.5)

Because of the latter relationship, we ordinarily use the moment generating function instead of the
characteristic function to avoid writing the complex unit i. When desirable, we convert easily by the change of
variable.

The integral transform character of these entities implies that there is essentially a one-to-one relationship 
between the transform and the distribution. 
Moments 

The name and some of the importance of the moment generating function arise from the fact that the
derivatives of $M_X$ evaluated at s = 0 are the moments about the origin. Specifically

$M_X^{(k)}(0) = E[X^k]$, provided the kth moment exists (13.6)

Since expectation is an integral and because of the regularity of the integrand, we may differentiate inside
the integral with respect to the parameter.

$M_X'(s) = \frac{d}{ds}E[e^{sX}] = E\left[\frac{d}{ds}e^{sX}\right] = E[Xe^{sX}]$ (13.7)

Upon setting s = 0, we have $M_X'(0) = E[X]$. Repeated differentiation gives the general result. The
corresponding result for the characteristic function is $\phi^{(k)}(0) = i^k E[X^k]$.

Example 13.1: The exponential distribution 

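The density function is $f_X(t) = \lambda e^{-\lambda t}$ for $t \ge 0$.

$M_X(s) = E[e^{sX}] = \int_0^\infty \lambda e^{-(\lambda - s)t}\,dt = \frac{\lambda}{\lambda - s}$ (13.8)

$M_X'(s) = \frac{\lambda}{(\lambda - s)^2}, \quad M_X''(s) = \frac{2\lambda}{(\lambda - s)^3}$ (13.9)

$E[X] = M_X'(0) = \frac{\lambda}{\lambda^2} = \frac{1}{\lambda}, \quad E[X^2] = M_X''(0) = \frac{2\lambda}{\lambda^3} = \frac{2}{\lambda^2}$ (13.10)

From this we obtain $\mathrm{Var}[X] = 2/\lambda^2 - 1/\lambda^2 = 1/\lambda^2$.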
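
The moment relations in this example can be checked numerically in basic MATLAB. The sketch below is an illustration under stated assumptions: the rate lambda = 2 and the step h are arbitrary choices, the moment generating function is evaluated by numerical integration with integral, and the derivatives at s = 0 are approximated by central differences rather than computed symbolically.

```matlab
lambda = 2;                                   % illustrative rate (an assumption)
% E[e^{sX}] by numerical integration; valid for s < lambda
M = @(s) integral(@(t) lambda*exp(-(lambda - s)*t), 0, Inf, ...
                  'RelTol', 1e-12, 'AbsTol', 1e-14);
h  = 0.01;                                    % step for central differences
M1 = (M(h) - M(-h))/(2*h);                    % approximates M'(0)  = E[X]
M2 = (M(h) - 2*M(0) + M(-h))/h^2;             % approximates M''(0) = E[X^2]
fprintf('M(0.5) = %.6f   lambda/(lambda - 0.5) = %.6f\n', M(0.5), lambda/(lambda - 0.5));
fprintf('E[X]   approx %.4f   1/lambda   = %.4f\n', M1, 1/lambda);
fprintf('E[X^2] approx %.4f   2/lambda^2 = %.4f\n', M2, 2/lambda^2);
fprintf('Var[X] approx %.4f   1/lambda^2 = %.4f\n', M2 - M1^2, 1/lambda^2);
```

The finite-difference values agree with the closed forms only up to the truncation error of the difference quotients, which shrinks as h is reduced.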
The generating function does not lend itself readily to computing moments, except that

$g_X'(s) = \sum_{k=1}^\infty k s^{k-1} P(X = k)$ so that $g_X'(1) = \sum_{k=1}^\infty k P(X = k) = E[X]$ (13.11)

For higher order moments, we may convert the generating function to the moment generating function by
replacing s with $e^s$, then work with $M_X$ and its derivatives.

Example 13.2: The Poisson ($\mu$) distribution

$P(X = k) = e^{-\mu}\frac{\mu^k}{k!}$, $k \ge 0$, so that

$g_X(s) = e^{-\mu}\sum_{k=0}^\infty s^k\frac{\mu^k}{k!} = e^{-\mu}\sum_{k=0}^\infty \frac{(s\mu)^k}{k!} = e^{-\mu}e^{\mu s} = e^{\mu(s-1)}$ (13.12)

We convert to $M_X$ by replacing s with $e^s$ to get $M_X(s) = e^{\mu(e^s - 1)}$. Then

$M_X'(s) = e^{\mu(e^s - 1)}\mu e^s, \quad M_X''(s) = e^{\mu(e^s - 1)}\left[\mu^2 e^{2s} + \mu e^s\right]$ (13.13)

so that

$E[X] = M_X'(0) = \mu$, $E[X^2] = M_X''(0) = \mu^2 + \mu$, and $\mathrm{Var}[X] = \mu^2 + \mu - \mu^2 = \mu$ (13.14)

These results agree, of course, with those found by direct computation with the distribution.
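
The direct computation mentioned here is easy to carry out numerically. The following sketch assumes an illustrative parameter mu = 3 and truncates the Poisson distribution at k = 100 (the neglected tail is negligible for this mu); it then compares the series for the generating function with the closed form found above.

```matlab
mu = 3;                                        % illustrative parameter (an assumption)
k  = 0:100;                                    % truncated range of values
pk = exp(-mu + k*log(mu) - gammaln(k + 1));    % P(X = k) = e^{-mu} mu^k / k!
EX  = dot(k, pk);                              % compare with mu
EX2 = dot(k.^2, pk);                           % compare with mu^2 + mu
VX  = EX2 - EX^2;                              % compare with mu
s   = 0.7;                                     % any point with |s| <= 1
gs  = dot(s.^k, pk);                           % generating function from the series
fprintf('EX = %.6f  VX = %.6f\n', EX, VX);
fprintf('g(s) from series = %.8f   exp(mu*(s-1)) = %.8f\n', gs, exp(mu*(s - 1)));
```

Using gammaln rather than factorial keeps the probabilities well scaled for larger values of k.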

Operational properties 

We refer to the following as operational properties. 

(T1): If Z = aX + b, then

$M_Z(s) = e^{bs}M_X(as), \quad \phi_Z(u) = e^{iub}\phi_X(au), \quad g_Z(s) = s^b g_X(s^a)$ (13.15)

For the moment generating function, this pattern follows from

$E[e^{(aX+b)s}] = e^{bs}E[e^{(as)X}]$ (13.16)

Similar arguments hold for the other two.
(T2): If the pair {X, Y} is independent, then

$M_{X+Y}(s) = M_X(s)M_Y(s), \quad \phi_{X+Y}(u) = \phi_X(u)\phi_Y(u), \quad g_{X+Y}(s) = g_X(s)g_Y(s)$ (13.17)

For the moment generating function, $e^{sX}$ and $e^{sY}$ form an independent pair for each value of the
parameter s. By the product rule for expectation

$E[e^{s(X+Y)}] = E[e^{sX}e^{sY}] = E[e^{sX}]E[e^{sY}]$ (13.18)

Similar arguments are used for the other two transforms.
A partial converse for (T2) is as follows:
(T3): If $M_{X+Y}(s) = M_X(s)M_Y(s)$, then the pair {X, Y} is uncorrelated. To show this, we obtain two
expressions for $E[(X + Y)^2]$, one by direct expansion and use of linearity, and the other by taking
the second derivative of the moment generating function.

$E[(X + Y)^2] = E[X^2] + E[Y^2] + 2E[XY]$ (13.19)

$M_{X+Y}''(s) = [M_X(s)M_Y(s)]'' = M_X''(s)M_Y(s) + M_X(s)M_Y''(s) + 2M_X'(s)M_Y'(s)$ (13.20)

On setting s = 0 and using the fact that $M_X(0) = M_Y(0) = 1$, we have

$E[(X + Y)^2] = E[X^2] + E[Y^2] + 2E[X]E[Y]$ (13.21)

which implies the equality E [XY] = E [X] E [Y].

Note that we have not shown that being uncorrelated implies the product rule.

We utilize these properties in determining the moment generating and generating functions for several of 
our common distributions. 

Some discrete distributions 

1. Indicator function $X = I_E$, $P(E) = p$

$g_X(s) = s^0 q + s^1 p = q + ps, \quad M_X(s) = g_X(e^s) = q + pe^s$ (13.22)

2. Simple random variable $X = \sum_{i=1}^n t_i I_{A_i}$ (primitive form), $P(A_i) = p_i$

$M_X(s) = \sum_{i=1}^n e^{st_i}p_i$ (13.23)

3. Binomial(n, p). $X = \sum_{i=1}^n I_{E_i}$ with $\{I_{E_i} : 1 \le i \le n\}$ iid, $P(E_i) = p$

We use the product rule for sums of independent random variables and the generating function for the
indicator function.

$g_X(s) = \prod_{i=1}^n (q + ps) = (q + ps)^n, \quad M_X(s) = (q + pe^s)^n$ (13.24)

4. Geometric(p). $P(X = k) = pq^k$ for all $k \ge 0$, $E[X] = q/p$. We use the formula for the geometric series to get

$g_X(s) = \sum_{k=0}^\infty pq^ks^k = p\sum_{k=0}^\infty (qs)^k = \frac{p}{1 - qs}, \quad M_X(s) = \frac{p}{1 - qe^s}$ (13.25)

5. Negative binomial(m, p). If $Y_m$ is the number of the trial in a Bernoulli sequence on which the mth
success occurs, and $X_m = Y_m - m$ is the number of failures before the mth success, then

$P(X_m = k) = P(Y_m - m = k) = C(-m, k)(-q)^k p^m$ (13.26)

where $C(-m, k) = \frac{-m(-m-1)(-m-2)\cdots(-m-k+1)}{k!}$ (13.27)

The power series expansion about t = 0 shows that

$(1 + t)^{-m} = 1 + C(-m, 1)t + C(-m, 2)t^2 + \cdots$ for $-1 < t < 1$ (13.28)

Hence

$M_{X_m}(s) = p^m\sum_{k=0}^\infty C(-m, k)(-q)^k e^{sk} = \left[\frac{p}{1 - qe^s}\right]^m$ (13.29)

Comparison with the moment generating function for the geometric distribution shows that $X_m =
Y_m - m$ has the same distribution as the sum of m iid random variables, each geometric (p). This
suggests that the sequence is characterized by independent, successive waiting times to success. This
also shows that the expectation and variance of $X_m$ are m times the expectation and variance for the
geometric. Thus

$E[X_m] = mq/p$ and $\mathrm{Var}[X_m] = mq/p^2$ (13.30)

6. Poisson($\mu$). $P(X = k) = e^{-\mu}\frac{\mu^k}{k!}$ for all $k \ge 0$. In Example 13.2 (The Poisson ($\mu$) distribution), above, we
establish $g_X(s) = e^{\mu(s-1)}$ and $M_X(s) = e^{\mu(e^s - 1)}$. If {X, Y} is an independent pair, with X ~ Poisson
($\lambda$) and Y ~ Poisson ($\mu$), then Z = X + Y ~ Poisson ($\lambda + \mu$). This follows from (T2) and the product of
exponentials.
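
The product form of the binomial generating function can be illustrated with the basic MATLAB function conv, which multiplies polynomials given by their coefficient vectors. In the sketch below the parameters n and p are illustrative assumptions; multiplying n copies of q + ps produces the coefficients of $(q + ps)^n$, which should match the binomial probabilities.

```matlab
n = 10;  p = 0.3;  q = 1 - p;           % illustrative parameters (assumptions)
coef = 1;                                % coefficients of the constant polynomial 1
for i = 1:n
    coef = conv(coef, [q p]);            % multiply by (q + p*s), ascending powers of s
end
k   = 0:n;
pmf = arrayfun(@(j) nchoosek(n, j), k).*p.^k.*q.^(n - k);   % C(n,k) p^k q^(n-k)
maxdiff = max(abs(coef - pmf));          % should be at rounding level
EX = dot(k, coef);                       % compare with n*p
VX = dot(k.^2, coef) - EX^2;             % compare with n*p*q
fprintf('max difference = %.2e  EX = %.4f  VX = %.4f\n', maxdiff, EX, VX);
```

The same loop with other coefficient vectors in place of [q p] gives the distribution for a sum of independent nonnegative, integer-valued random variables.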

Some absolutely continuous distributions 

1. Uniform on (a, b). $f_X(t) = \frac{1}{b - a}$, $a < t < b$

$M_X(s) = \int e^{st}f_X(t)\,dt = \frac{1}{b - a}\int_a^b e^{st}\,dt = \frac{e^{sb} - e^{sa}}{s(b - a)}$ (13.31)

2. Symmetric triangular (-c, c)

$f_X(t) = I_{[-c,0)}(t)\,\frac{c + t}{c^2} + I_{[0,c]}(t)\,\frac{c - t}{c^2}$ (13.32)

$M_X(s) = \frac{1}{c^2}\int_{-c}^0 (c + t)e^{st}\,dt + \frac{1}{c^2}\int_0^c (c - t)e^{st}\,dt = \frac{e^{cs} + e^{-cs} - 2}{c^2s^2}$ (13.33)

$= \frac{e^{cs} - 1}{cs} \cdot \frac{1 - e^{-cs}}{cs} = M_Y(s)M_Z(-s) = M_Y(s)M_{-Z}(s)$ (13.34)

where $M_Y$ is the moment generating function for Y ~ uniform (0, c) and similarly for $M_Z$. Thus,
X has the same distribution as the difference of two independent random variables, each uniform on
(0, c).

3. Exponential($\lambda$). $f_X(t) = \lambda e^{-\lambda t}$, $t \ge 0$

In Example 13.1, above, we show that $M_X(s) = \frac{\lambda}{\lambda - s}$.

4. Gamma($\alpha$, $\lambda$). $f_X(t) = \frac{1}{\Gamma(\alpha)}\lambda^\alpha t^{\alpha - 1}e^{-\lambda t}$, $t \ge 0$

$M_X(s) = \frac{\lambda^\alpha}{\Gamma(\alpha)}\int_0^\infty t^{\alpha - 1}e^{-(\lambda - s)t}\,dt = \left[\frac{\lambda}{\lambda - s}\right]^\alpha$ (13.35)

For $\alpha = n$, a positive integer,

$M_X(s) = \left[\frac{\lambda}{\lambda - s}\right]^n$ (13.36)

which shows that in this case X has the distribution of the sum of n independent random variables
each exponential ($\lambda$).
5. Normal($\mu$, $\sigma^2$).

• The standardized normal, Z ~ N (0, 1)

$M_Z(s) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{st}e^{-t^2/2}\,dt$ (13.37)

Now $st - \frac{t^2}{2} = \frac{s^2}{2} - \frac{1}{2}(t - s)^2$ so that

$M_Z(s) = e^{s^2/2}\frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{-(t - s)^2/2}\,dt = e^{s^2/2}$ (13.38)

since the integrand (including the constant $1/\sqrt{2\pi}$) is the density for N (s, 1).
• $X = \sigma Z + \mu$ implies by property (T1)

$M_X(s) = e^{s\mu}e^{\sigma^2s^2/2} = \exp\left(\frac{\sigma^2s^2}{2} + s\mu\right)$ (13.39)

Example 13.3: Affine combination of independent normal random variables

Suppose {X, Y} is an independent pair with X ~ N ($\mu_X$, $\sigma_X^2$) and Y ~ N ($\mu_Y$, $\sigma_Y^2$). Let Z =
aX + bY + c. Then Z is normal, for by properties of expectation and variance

$\mu_Z = a\mu_X + b\mu_Y + c$ and $\sigma_Z^2 = a^2\sigma_X^2 + b^2\sigma_Y^2$ (13.40)

and by the operational properties for the moment generating function

$M_Z(s) = e^{sc}M_X(as)M_Y(bs) = \exp\left(\frac{(a^2\sigma_X^2 + b^2\sigma_Y^2)s^2}{2} + s(a\mu_X + b\mu_Y + c)\right)$ (13.41)

$= \exp\left(\frac{\sigma_Z^2s^2}{2} + s\mu_Z\right)$ (13.42)

The form of $M_Z$ shows that Z is normally distributed.
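
The identity in this example can be spot-checked numerically. The sketch below is only an illustration: all parameter values are arbitrary assumptions, and the helper names npdf and mgf are hypothetical anonymous functions defined here (they are not part of the book's m-programs). Each moment generating function is evaluated by numerical integration and the product is compared with the normal form.

```matlab
muX = 1;  sX = 2;  muY = -0.5;  sY = 1.5;     % illustrative parameters (assumptions)
a = 2;  b = -1;  c = 3;  s = 0.2;             % coefficients and an evaluation point
% normal density with mean m and standard deviation sd (hypothetical helper)
npdf = @(t, m, sd) exp(-(t - m).^2/(2*sd^2))/(sd*sqrt(2*pi));
% E[e^{vX}] for X ~ N(m, sd^2) by numerical integration (hypothetical helper)
mgf  = @(v, m, sd) integral(@(t) exp(v*t).*npdf(t, m, sd), -Inf, Inf, ...
                            'RelTol', 1e-10, 'AbsTol', 1e-12);
MZ_product = exp(s*c)*mgf(a*s, muX, sX)*mgf(b*s, muY, sY);   % e^{sc} M_X(as) M_Y(bs)
muZ  = a*muX + b*muY + c;                     % mean of Z
varZ = a^2*sX^2 + b^2*sY^2;                   % variance of Z
MZ_normal = exp(varZ*s^2/2 + s*muZ);          % mgf of N(muZ, varZ) at s
relerr = abs(MZ_product - MZ_normal)/MZ_normal;
fprintf('product = %.8f  normal form = %.8f  relative error = %.2e\n', ...
        MZ_product, MZ_normal, relerr);
```

A check at a single value of s does not prove the identity; it only guards against algebra slips in the exponent.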
Moment generating function and simple random variables 

Suppose $X = \sum_{i=1}^n t_i I_{A_i}$ in canonical form. That is, $A_i$ is the event $\{X = t_i\}$ for each of the distinct
values in the range of X, with $p_i = P(A_i) = P(X = t_i)$. Then the moment generating function for X is

$M_X(s) = \sum_{i=1}^n p_i e^{st_i}$ (13.43)

The moment generating function $M_X$ is thus related directly and simply to the distribution for random
variable X.

Consider the problem of determining the sum of an independent pair {X, Y} of simple random variables.
The moment generating function for the sum is the product of the moment generating functions. Now if
$Y = \sum_{j=1}^m u_j I_{B_j}$, with $P(Y = u_j) = \pi_j$, we have

$M_X(s)M_Y(s) = \left(\sum_{i=1}^n p_i e^{st_i}\right)\left(\sum_{j=1}^m \pi_j e^{su_j}\right) = \sum_{i,j} p_i\pi_j e^{s(t_i + u_j)}$ (13.44)

The various values are sums $t_i + u_j$ of pairs $(t_i, u_j)$ of values. Each of these sums has probability $p_i\pi_j$
for the values corresponding to $t_i$, $u_j$. Since more than one pair sum may have the same value, we need to
sort the values, consolidate like values and add the probabilities for like values to achieve the distribution
for the sum. We have an m-function mgsum for achieving this directly. It produces the pair-products for
the probabilities and the pair-sums for the values, then performs a csort operation. Although not directly
dependent upon the moment generating function analysis, it produces the same result as that produced by
multiplying moment generating functions.

Example 13.4: Distribution for a sum of independent simple random variables 

Suppose the pair {X, Y} is independent with distributions 

X = [1 3 5 7] Y = [2 3 4] PX = [0.2 0.4 0.3 0.1] PY = [0.3 0.5 0.2] (13.45) 

Determine the distribution for Z = X + Y. 



X = [1 3 5 7];
Y = 2:4;
PX = 0.1*[2 4 3 1];
PY = 0.1*[3 5 2];
[Z,PZ] = mgsum(X,Y,PX,PY);
disp([Z;PZ]')
    3.0000    0.0600
    4.0000    0.1000
    5.0000    0.1600
    6.0000    0.2000
    7.0000    0.1700
    8.0000    0.1500
    9.0000    0.0900
   10.0000    0.0500
   11.0000    0.0200




This could, of course, have been achieved by using icalc and csort, which has the advantage that other 
functions of X and Y may be handled. Also, since the random variables are nonnegative, integer-valued, 
the MATLAB convolution function may be used (see Example 13.7 (Sum of independent simple random 
variables)). By repeated use of the function mgsum, we may obtain the distribution for the sum of more 
than two simple random variables. The m-functions mgsum3 and mgsum4 utilize this strategy. 

The techniques for simple random variables may be used with the simple approximations to absolutely 
continuous random variables. 

Example 13.5: Difference of uniform distribution 

The moment generating functions for the uniform and the symmetric triangular show that the latter 
appears naturally as the difference of two uniformly distributed random variables. We consider X 
and Y iid, uniform on [0,1]. 

tappr
Enter matrix [a b] of x-range endpoints  [0 1]
Enter number of x approximation points  200
Enter density as a function of t  t<=1
Use row matrices X and PX as in the simple case
[Z,PZ] = mgsum(X,-X,PX,PX);
plot(Z,PZ/d)               % Divide by d to recover f(t)
% plotting details         see Figure 13.1







Figure 13.1: Density for the difference of an independent pair, uniform (0,1) 



The generating function 

The form of the generating function for a nonnegative, integer-valued random variable exhibits a number
of important properties.



$$X = \sum_{k=0}^{\infty} k I_{A_k} \ \text{(canonical form)} \qquad p_k = P(A_k) = P(X = k) \qquad g_X(s) = \sum_{k=0}^{\infty} s^k p_k$$ (13.46)

1. As a power series in s with nonnegative coefficients whose partial sums converge to one, the series
converges at least for $|s| \le 1$.

2. The coefficients of the power series display the distribution: for value k the probability $p_k = P(X = k)$
is the coefficient of $s^k$.

3. The power series expansion about the origin of an analytic function is unique. If the generating function
is known in closed form, the unique power series expansion about the origin determines the distribution.
If the power series converges to a known closed form, that form characterizes the distribution.

4. For a simple random variable (i.e., $p_k = 0$ for $k > n$), $g_X$ is a polynomial.

Example 13.6: The Poisson distribution 

In Example 13.2 (The Poisson ($\mu$) distribution), above, we establish the generating function for
$X \sim$ Poisson ($\mu$) from the distribution. Suppose, however, we simply encounter the generating
function

$$g_X(s) = e^{m(s-1)} = e^{-m} e^{ms}$$ (13.47)

From the known power series for the exponential, we get

$$g_X(s) = e^{-m} \sum_{k=0}^{\infty} \frac{(ms)^k}{k!} = \sum_{k=0}^{\infty} e^{-m} \frac{m^k}{k!} s^k$$ (13.48)

We conclude that

$$P(X = k) = e^{-m} \frac{m^k}{k!}, \quad 0 \le k$$ (13.49)

which is the Poisson distribution with parameter $\mu = m$.

For simple, nonnegative, integer-valued random variables, the generating functions are polynomials. Because 
of the product rule (T2), the problem of determining the distribution for the sum of
independent random variables may be handled by the process of multiplying polynomials. This may be done 
quickly and easily with the MATLAB convolution function. 

Example 13.7: Sum of independent simple random variables 

Suppose the pair {X, Y} is independent, with

$$g_X(s) = \frac{1}{10}\left(2 + 3s + 3s^2 + 2s^5\right) \qquad g_Y(s) = \frac{1}{10}\left(2s + 4s^2 + 4s^3\right)$$ (13.50)

In the MATLAB convolution function conv, all powers of s must be accounted for by including zeros
for the missing powers.

gx = 0.1*[2 3 3 0 0 2];      % Zeros for missing powers 3, 4
gy = 0.1*[0 2 4 4];          % Zero for missing power 0
gz = conv(gx,gy);
a = ['       Z         PZ'];
b = [0:8;gz]';
disp(a)
       Z         PZ          % Distribution for Z = X + Y
disp(b)
         0         0
    1.0000    0.0400
    2.0000    0.1400
    3.0000    0.2600
    4.0000    0.2400
    5.0000    0.1200
    6.0000    0.0400
    7.0000    0.0800
    8.0000    0.0800

If mgsum were used, it would not be necessary to be concerned about missing powers and the 
corresponding zero coefficients. 
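
When the distributions are given as lists of values and probabilities, the zero padding for conv can be done mechanically. The following is a sketch in base MATLAB under the assumption that all values are nonnegative integers; the variable names are illustrative, and this is not one of the text's m-functions.

```matlab
% Values and probabilities read off from g_X and g_Y in Example 13.7
X = [0 1 2 5];   PX = 0.1*[2 3 3 2];
Y = [1 2 3];     PY = 0.1*[2 4 4];

% The coefficient of s^k goes in position k+1; missing powers get zeros
gx = accumarray(X' + 1, PX')';     % coefficients for powers 0..5
gy = accumarray(Y' + 1, PY')';     % coefficients for powers 0..3

gz = conv(gx, gy);                 % coefficients of g_Z = g_X * g_Y
Z  = 0:(length(gz) - 1);           % powers of s, i.e. values of Z = X + Y
disp([Z; gz]')
```

The `accumarray` call places each probability at index value + 1 and fills the remaining positions with zeros, which is exactly the padding that conv requires.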



13.1.2 Integral transforms 

We consider briefly the relationship of the moment generating function and the characteristic function with 
well known integral transforms (hence the name of this chapter) . 

Moment generating function and the Laplace transform 

When we examine the integral forms of the moment generating function, we see that they represent forms 
of the Laplace transform, widely used in engineering and applied mathematics. Suppose Fx is a probability 
distribution function with $F_X(-\infty) = 0$. The bilateral Laplace transform for $F_X$ is given by

$$\int_{-\infty}^{\infty} e^{-st} F_X(t)\, dt$$ (13.51)

The Laplace-Stieltjes transform for $F_X$ is

$$\int_{-\infty}^{\infty} e^{-st} F_X(dt)$$ (13.52)

Thus, if $M_X$ is the moment generating function for X, then $M_X(-s)$ is the Laplace-Stieltjes transform for
X (or, equivalently, for $F_X$).

The theory of Laplace-Stieltjes transforms shows that under conditions sufficiently general to include all
practical distribution functions

$$M_X(-s) = \int_{-\infty}^{\infty} e^{-st} F_X(dt) = s \int_{-\infty}^{\infty} e^{-st} F_X(t)\, dt$$ (13.53)

Hence

$$\frac{1}{s} M_X(-s) = \int_{-\infty}^{\infty} e^{-st} F_X(t)\, dt$$ (13.54)

The right hand expression is the bilateral Laplace transform of $F_X$. We may use tables of Laplace transforms
to recover $F_X$ when $M_X$ is known. This is particularly useful when the random variable X is nonnegative,
so that $F_X(t) = 0$ for $t < 0$.

If X is absolutely continuous, then

$$M_X(-s) = \int_{-\infty}^{\infty} e^{-st} f_X(t)\, dt$$ (13.55)

In this case, $M_X(-s)$ is the bilateral Laplace transform of $f_X$. For nonnegative random variable X, we may
use ordinary tables of the Laplace transform to recover $f_X$.

Example 13.8: Use of Laplace transform 

Suppose nonnegative X has moment generating function 

$$M_X(s) = \frac{1}{1 - s}$$ (13.56)

We know that this is the moment generating function for the exponential (1) distribution. Now,

$$\frac{1}{s} M_X(-s) = \frac{1}{s(1 + s)} = \frac{1}{s} - \frac{1}{1 + s}$$ (13.57)

From a table of Laplace transforms, we find 1/s is the transform for the constant 1 (for $t \ge 0$) and
1/(1 + s) is the transform for $e^{-t}$, $t \ge 0$, so that $F_X(t) = 1 - e^{-t}$, $t \ge 0$, as expected.

Example 13.9: Laplace transform and the density 

Suppose the moment generating function for a nonnegative random variable is 



$$M_X(s) = \left(\frac{\lambda}{\lambda - s}\right)^{\alpha}$$ (13.58)

From a table of Laplace transforms, we find that for $\alpha > 0$,

$$\frac{\Gamma(\alpha)}{(s - a)^{\alpha}} \ \text{is the Laplace transform of} \ t^{\alpha - 1} e^{at}, \quad t \ge 0$$ (13.59)

If we put $a = -\lambda$, we find after some algebraic manipulations

$$f_X(t) = \frac{\lambda^{\alpha} t^{\alpha - 1} e^{-\lambda t}}{\Gamma(\alpha)}, \quad t \ge 0$$ (13.60)

Thus, $X \sim$ gamma ($\alpha$, $\lambda$), in keeping with the determination, above, of the moment generating
function for that distribution.
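
The algebraic manipulations amount to matching the table entry to $M_X(-s)$. A brief worked version, taking $M_X(s) = (\lambda/(\lambda - s))^{\alpha}$ (the gamma ($\alpha$, $\lambda$) moment generating function) and using only the table entry quoted in this example:

```latex
\begin{align*}
M_X(-s) &= \left(\frac{\lambda}{\lambda + s}\right)^{\alpha}
         = \frac{\lambda^{\alpha}}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha)}{(s + \lambda)^{\alpha}} \\
\left.\frac{\Gamma(\alpha)}{(s - a)^{\alpha}}\right|_{a = -\lambda}
        &= \frac{\Gamma(\alpha)}{(s + \lambda)^{\alpha}}
         \quad \text{is the transform of} \quad t^{\alpha - 1} e^{-\lambda t}, \ t \ge 0 \\
f_X(t)  &= \frac{\lambda^{\alpha}}{\Gamma(\alpha)} \, t^{\alpha - 1} e^{-\lambda t}, \quad t \ge 0
\end{align*}
```

The last line follows from linearity of the transform: the constant factor $\lambda^{\alpha}/\Gamma(\alpha)$ carries over unchanged to the density.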

The characteristic function 

Since this function differs from the moment generating function by the interchange of parameter s and
iu, where i is the imaginary unit, $i^2 = -1$, the integral expressions make that change of parameter. The
result is that Laplace transforms become Fourier transforms. The theoretical and applied literature is even
more extensive for the characteristic function.

Not only do we have the operational properties (T1) and (T2) and
the result on moments as derivatives at the origin, but there is an important expansion for the characteristic
function.

An expansion theorem 

If $E[|X|^n] < \infty$, then

$$\phi^{(k)}(0) = i^k E[X^k], \ \text{for } 0 \le k \le n \quad \text{and} \quad \phi(u) = \sum_{k=0}^{n} \frac{(iu)^k}{k!} E[X^k] + o(u^n) \ \text{as } u \to 0$$ (13.61)

We note one limit theorem which has very important consequences. 

A fundamental limit theorem 

Suppose $\{F_n : 1 \le n\}$ is a sequence of probability distribution functions and $\{\phi_n : 1 \le n\}$ is the
corresponding sequence of characteristic functions.

1. If F is a distribution function such that $F_n(t) \to F(t)$ at every point of continuity for F, and $\phi$ is the
characteristic function for F, then

$$\phi_n(u) \to \phi(u) \quad \forall u$$ (13.62)

2. If $\phi_n(u) \to \phi(u)$ for all u and $\phi$ is continuous at 0, then $\phi$ is the characteristic function for distribution
function F such that

$$F_n(t) \to F(t) \ \text{at each point of continuity of } F$$ (13.63)

□
13.2 Convergence and the Central Limit Theorem 2

13.2.1 The Central Limit Theorem 

The central limit theorem (CLT) asserts that if random variable X is the sum of a large class of independent 
random variables, each with reasonable distributions, then X is approximately normally distributed. This 
celebrated theorem has been the object of extensive theoretical research directed toward the discovery of the 
most general conditions under which it is valid. On the other hand, this theorem serves as the basis of an 
extraordinary amount of applied work. In the statistics of large samples, the sample average is a constant 
times the sum of the random variables in the sampling process. Thus, for large samples, the sample average
is approximately normal — whether or not the population distribution is normal. In much of the theory of 
errors of measurement, the observed error is the sum of a large number of independent random quantities 
which contribute additively to the result. Similarly, in the theory of noise, the noise signal is the sum of 
a large number of random components, independently produced. In such situations, the assumption of a 
normal population distribution is frequently quite appropriate. 

We consider a form of the CLT under hypotheses which are reasonable assumptions in many practical 
situations. We sketch a proof of this version of the CLT, known as the Lindeberg-Levy theorem, which utilizes 
the limit theorem on characteristic functions, above, along with certain elementary facts from analysis. It 
illustrates the kind of argument used in more sophisticated proofs required for more general cases. 

Consider an independent sequence $\{X_n : 1 \le n\}$ of random variables. Form the sequence of partial sums

$$S_n = \sum_{i=1}^{n} X_i \quad \forall n \ge 1 \quad \text{with} \quad E[S_n] = \sum_{i=1}^{n} E[X_i] \quad \text{and} \quad \operatorname{Var}[S_n] = \sum_{i=1}^{n} \operatorname{Var}[X_i]$$ (13.64)

Let $S_n^*$ be the standardized sum and let $F_n$ be the distribution function for $S_n^*$. The CLT asserts that under
appropriate conditions, $F_n(t) \to \Phi(t)$ as $n \to \infty$ for all t. We sketch a proof of the theorem under the
condition the $X_i$ form an iid class.

Central Limit Theorem (Lindeberg-Levy form) 
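If $\{X_n : 1 \le n\}$ is iid, with

$$E[X_i] = \mu, \quad \operatorname{Var}[X_i] = \sigma^2, \quad \text{and} \quad S_n^* = \frac{S_n - n\mu}{\sigma\sqrt{n}}$$ (13.65)

then

$$F_n(t) \to \Phi(t) \ \text{as } n \to \infty, \ \text{for all } t$$ (13.66)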

IDEAS OF A PROOF 

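There is no loss of generality in assuming $\mu = 0$. Let $\phi$ be the common characteristic function for the $X_i$,
and for each n let $\phi_n$ be the characteristic function for $S_n^*$. We have

$$\phi(t) = E\left[e^{itX}\right] \quad \text{and} \quad \phi_n(t) = E\left[e^{itS_n^*}\right] = \phi^n\left(t/\sigma\sqrt{n}\right)$$ (13.67)

Using the power series expansion of $\phi$ about the origin noted above, we have

$$\phi(t) = 1 - \frac{\sigma^2 t^2}{2} + \beta(t) \quad \text{where } \beta(t) = o(t^2) \ \text{as } t \to 0$$ (13.68)

This implies

$$\left|\phi\left(t/\sigma\sqrt{n}\right) - \left(1 - t^2/2n\right)\right| = \left|\beta\left(t/\sigma\sqrt{n}\right)\right| = o\left(t^2/\sigma^2 n\right)$$ (13.69)

so that

$$n\left|\phi\left(t/\sigma\sqrt{n}\right) - \left(1 - t^2/2n\right)\right| \to 0 \ \text{as } n \to \infty$$ (13.70)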



2 This content is available online at <http://cnx.org/content/m23475/1.5/>.
A standard lemma of analysis ensures

$$\left|\phi^n\left(t/\sigma\sqrt{n}\right) - \left(1 - t^2/2n\right)^n\right| \le n\left|\phi\left(t/\sigma\sqrt{n}\right) - \left(1 - t^2/2n\right)\right| \to 0 \ \text{as } n \to \infty$$ (13.71)

It is a well known property of the exponential that

$$\left(1 - \frac{t^2}{2n}\right)^n \to e^{-t^2/2} \ \text{as } n \to \infty$$ (13.72)

so that

$$\phi^n\left(t/\sigma\sqrt{n}\right) \to e^{-t^2/2} \ \text{as } n \to \infty \ \text{for all } t$$ (13.73)

By the convergence theorem on characteristic functions, above, $F_n(t) \to \Phi(t)$.

□

The theorem says that the distribution functions for sums of increasing numbers of the $X_i$ converge to
the normal distribution function, but it does not tell how fast. It is instructive to consider some examples, 
which are easily worked out with the aid of our m-functions. 
Demonstration of the central limit theorem 

Discrete examples 

We first examine the gaussian approximation in two cases. We take the sum of five iid simple random
variables in each case. The first variable has six distinct values; the second has only three. The discrete
character of the sum is more evident in the second case. Here we use not only the gaussian approximation,
but the gaussian approximation shifted one half unit (the so-called continuity correction for integer-valued
random variables). The fit is remarkably good in either case with only five terms.

A principal tool is the m-function diidsum (sum of discrete iid random variables). It uses a designated 
number of iterations of mgsum. 
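
For integer-valued variables, the iteration can be sketched with conv in base MATLAB. This is an illustration only, not the text's diidsum or gaussian m-functions: it assumes the values are 0, 1, ..., m with probability vector p (the data below are hypothetical), and it evaluates the normal distribution function through the built-in erfc.

```matlab
% Hypothetical distribution on the values 0, 1, 2
p = [0.3 0.5 0.2];              % p(k+1) = P(X = k)
n = 5;                          % number of iid terms in the sum

% Repeated convolution gives the distribution of X1 + ... + Xn
ps = 1;                         % distribution of the empty sum (value 0)
for i = 1:n
    ps = conv(ps, p);
end
x = 0:(length(ps) - 1);         % values of the sum
F = cumsum(ps);                 % distribution function of the sum

% Mean and variance of one term, then the gaussian approximation
k  = 0:(length(p) - 1);
EX = dot(k, p);
VX = dot(k.^2, p) - EX^2;
Phi = @(t) 0.5*erfc(-t/sqrt(2));              % standard normal distribution function
FG  = Phi((x + 0.5 - n*EX) / sqrt(n*VX));     % with continuity correction

disp(max(abs(F - FG)))          % largest gap between F and the approximation
```

Evaluating the approximation at x + 0.5 rather than at x is the continuity correction described above. Dropping the 0.5 shows how much the correction matters for a sum of so few terms.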

Example 13.10: First random variable 

X = [-3.2 -1.05 2.1 4.6 5.3 7.2];
PX = 0.1*[2 2 1 3 1 1];
EX = X*PX'
EX = 1.9900
VX = dot(X.^2,PX) - EX^2
VX = 13.0904
[x,px] = diidsum(X,PX,5);            % Distribution for the sum of 5 iid rv
F = cumsum(px);                      % Distribution function for the sum
stairs(x,F)                          % Stair step plot
hold on
plot(x,gaussian(5*EX,5*VX,x),'-.')   % Plot of gaussian distribution function
% Plotting details                   (see Figure 13.2)

Figure 13.2: Distribution for the sum of five iid random variables.



Example 13.11: Second random variable 



X = 1:3;
PX = [0.3 0.5 0.2];
EX = X*PX'
EX = 1.9000
EX2 = X.^2*PX'
EX2 = 4.1000
VX = EX2 - EX^2
VX = 0.4900
[x,px] = diidsum(X,PX,5);               % Distribution for the sum of 5 iid rv
F = cumsum(px);                         % Distribution function for the sum
stairs(x,F)                             % Stair step plot
hold on
plot(x,gaussian(5*EX,5*VX,x),'-.')      % Plot of gaussian distribution function
plot(x,gaussian(5*EX,5*VX,x+0.5),'o')   % Plot with continuity correction
% Plotting details                      (see Figure 13.3)

Figure 13.3: Distribution for the sum of five iid random variables.



As another example, we take the sum of twenty one iid simple random variables with integer values. We 
examine only part of the distribution function where most of the probability is concentrated. This effectively 
enlarges the x-scale, so that the nature of the approximation is more readily apparent. 

Example 13.12: Sum of twenty-one iid random variables 



X = [0 1 3 5 6];
PX = 0.1*[1 2 3 2 2];
EX = dot(X,PX)
EX = 3.3000
VX = dot(X.^2,PX) - EX^2
VX = 4.2100
[x,px] = diidsum(X,PX,21);
F = cumsum(px);
FG = gaussian(21*EX,21*VX,x);
stairs(40:90,F(40:90))
hold on
plot(40:90,FG(40:90))
% Plotting details            (see Figure 13.4)

Figure 13.4: Distribution for the sum of twenty one iid random variables.



Absolutely continuous examples 

By use of the discrete approximation, we may get approximations to the sums of absolutely continuous 
random variables. The results on discrete variables indicate that the more values the more quickly the
convergence seems to occur. In our next example, we start with a random variable uniform on (0, 1).

Example 13.13: Sum of three iid, uniform random variables. 

Suppose X ~ uniform (0, 1). Then E [X] = 0.5 and Var [X] = 1/12. 

tappr
Enter matrix [a b] of x-range endpoints  [0 1]
Enter number of x approximation points  100
Enter density as a function of t  t<=1
Use row matrices X and PX as in the simple case
EX = 0.5;
VX = 1/12;
[z,pz] = diidsum(X,PX,3);
F = cumsum(pz);
FG = gaussian(3*EX,3*VX,z);
length(z)
ans = 298
a = 1:5:296;
plot(z(a),F(a),z(a),FG(a),'o')     % Plot every fifth point
% Plotting details                 (see Figure 13.5)

Figure 13.5: Distribution for the sum of three iid uniform random variables.



For the sum of only three random variables, the fit is remarkably good. This is not entirely surprising, since 
the sum of two gives a symmetric triangular distribution on (0, 2). Other distributions may take many more 
terms to get a good fit. Consider the following example. 

Example 13.14: Sum of eight iid random variables 

Suppose the density is one on the intervals (-1, -0.5) and (0.5, 1). Although the density is symmetric,
it has two separate regions of probability. From symmetry, E[X] = 0. Calculations show
$\operatorname{Var}[X] = E[X^2] = 7/12$. The MATLAB computations are:

tappr
Enter matrix [a b] of x-range endpoints  [-1 1]
Enter number of x approximation points  200
Enter density as a function of t  (t<=-0.5)|(t>=0.5)
Use row matrices X and PX as in the simple case
[z,pz] = diidsum(X,PX,8);
VX = 7/12;
F = cumsum(pz);
FG = gaussian(0,8*VX,z);
plot(z,F,z,FG)
% Plotting details (see Figure 13.6)

Figure 13.6: Distribution for the sum of eight iid uniform random variables.



Although the sum of eight random variables is used, the fit to the gaussian is not as good as that for the
sum of three in Example 13.13 (Sum of three iid, uniform random variables). In either case,
the convergence is remarkably fast: only a few terms are needed for good approximation.



13.2.2 Convergence phenomena in probability theory 

The central limit theorem exhibits one of several kinds of convergence important in probability theory, 
namely convergence in distribution (sometimes called weak convergence). The increasing concentration of 
values of the sample average random variable $A_n$ with increasing n illustrates convergence in probability.
The convergence of the sample average is a form of the so-called weak law of large numbers. For large enough
n the probability that $A_n$ lies within a given distance of the population mean can be made as near one as
desired. The fact that the variance of $A_n$ becomes small for large n illustrates convergence in the mean (of
order 2).

$$E\left[|A_n - \mu|^2\right] \to 0 \ \text{as } n \to \infty$$ (13.74)



In the calculus, we deal with sequences of numbers. If $\{a_n : 1 \le n\}$ is a sequence of real numbers, we say
the sequence converges iff for N sufficiently large $a_n$ approximates arbitrarily closely some number L for all
$n \ge N$. This unique number L is called the limit of the sequence. Convergent sequences are characterized
by the fact that for large enough N, the distance $|a_n - a_m|$ between any two terms is arbitrarily small for
all $n, m \ge N$. Such a sequence is said to be fundamental (or Cauchy). To be precise, if we let $\epsilon > 0$ be the
error of approximation, then the sequence is

• Convergent iff there exists a number L such that for any $\epsilon > 0$ there is an N such that

$$|L - a_n| \le \epsilon \ \text{for all } n \ge N$$ (13.75)

• Fundamental iff for any $\epsilon > 0$ there is an N such that

$$|a_n - a_m| \le \epsilon \ \text{for all } n, m \ge N$$ (13.76)

As a result of the completeness of the real numbers, it is true that any fundamental sequence converges 
(i.e., has a limit). And such convergence has certain desirable properties. For example the limit of a linear 
combination of sequences is that linear combination of the separate limits; and limits of products are the 
products of the limits. 

The notion of convergent and fundamental sequences applies to sequences of real-valued functions with
a common domain. For each x in the domain, we have a sequence
$\{f_n(x) : 1 \le n\}$ of real numbers. The sequence may converge for some x and fail to converge for others.

A somewhat more restrictive condition (and often a more desirable one) for sequences of functions is
uniform convergence. Here the uniformity is over values of the argument x. In this case, for any $\epsilon > 0$ there
exists an N which works for all x (or for some suitable prescribed set of x).

These concepts may be applied to a sequence of random variables, which are real-valued functions with
domain $\Omega$ and argument $\omega$. Suppose $\{X_n : 1 \le n\}$ is a sequence of real random variables. For each argument
$\omega$ we have a sequence $\{X_n(\omega) : 1 \le n\}$ of real numbers. It is quite possible that such a sequence converges
for some $\omega$ and diverges (fails to converge) for others. As a matter of fact, in many important cases the
sequence converges for all $\omega$ except possibly a set (event) of probability zero. In this case, we say the sequence
converges almost surely (abbreviated a.s.). The notion of uniform convergence also applies. In probability
theory we have the notion of almost uniform convergence. This is the case that the sequence converges
uniformly for all $\omega$ except for a set of arbitrarily small probability.

The notion of convergence in probability noted above is a quite different kind of convergence. Rather
than deal with the sequence on a pointwise basis, it deals with the random variables as such. In the case
of sample average, the "closeness" to a limit is expressed in terms of the probability that the observed value
$X_n(\omega)$ should lie close to the value $X(\omega)$ of the limiting random variable. We may state this precisely as
follows:

A sequence $\{X_n : 1 \le n\}$ converges to X in probability, designated $X_n \xrightarrow{P} X$, iff for any $\epsilon > 0$,

$$\lim_{n} P(|X - X_n| > \epsilon) = 0$$ (13.77)

There is a corresponding notion of a sequence fundamental in probability. 

The following schematic representation may help to visualize the difference between almost-sure convergence
and convergence in probability. In setting up the basic probability model, we think in terms of "balls"
drawn from a jar or box. Instead of balls, consider for each possible outcome $\omega$ a "tape" on which there is
the sequence of values $X_1(\omega), X_2(\omega), X_3(\omega), \cdots$.

• If the sequence of random variables converges a.s. to a random variable X, then there is a set of
"exceptional tapes" which has zero probability. For all other tapes, $X_n(\omega) \to X(\omega)$. This means
that by going far enough out on any such tape, the values $X_n(\omega)$ beyond that point all lie within a
prescribed distance of the value $X(\omega)$ of the limit random variable.

• If the sequence converges in probability, the situation may be quite different. A tape is selected. For
n sufficiently large, the probability is arbitrarily near one that the observed value $X_n(\omega)$ lies within a
prescribed distance of $X(\omega)$. This says nothing about the values $X_m(\omega)$ on the selected tape for any
larger m. In fact, the sequence on the selected tape may very well diverge.

It is not difficult to construct examples for which there is convergence in probability but pointwise convergence
for no $\omega$. It is easy to confuse these two types of convergence. The kind of convergence noted for the sample
average is convergence in probability (a "weak" law of large numbers). What is really desired in most cases
is a.s. convergence (a "strong" law of large numbers). It turns out that for a sampling process of the kind
used in simple statistics, the convergence of the sample average is almost sure (i.e., the strong law holds).
To establish this requires much more detailed and sophisticated analysis than we are prepared to make in
this treatment.

The notion of mean convergence illustrated by the reduction of $\operatorname{Var}[A_n]$ with increasing n may be
expressed more generally and more precisely as follows. A sequence $\{X_n : 1 \le n\}$ converges in the mean of
order p to X iff

$$E\left[|X - X_n|^p\right] \to 0 \ \text{as } n \to \infty, \quad \text{designated } X_n \xrightarrow{L^p} X \ \text{as } n \to \infty$$ (13.78)

If the order p is one, we simply say the sequence converges in the mean. For p = 2, we speak of mean-square 
convergence. 

The introduction of a new type of convergence raises a number of questions. 

1. There is the question of fundamental (or Cauchy) sequences and convergent sequences. 

2. Do the various types of limits have the usual properties of limits? Is the limit of a linear combination 
of sequences the linear combination of the limits? Is the limit of products the product of the limits? 

3. What conditions imply the various kinds of convergence? 

4. What is the relation between the various kinds of convergence? 

Before sketching briefly some of the relationships between convergence types, we consider one important 
condition known as uniform integrability. According to the property (E9b) for integrals

$$X \ \text{is integrable iff} \ E\left[I_{\{|X| > a\}} |X|\right] \to 0 \ \text{as } a \to \infty$$ (13.79)

Roughly speaking, to be integrable a random variable cannot be too large on too large a set. We use
this characterization of the integrability of a single random variable to define the notion of the uniform
integrability of a class.

Definition. An arbitrary class $\{X_t : t \in T\}$ is uniformly integrable (abbreviated u.i.) with respect to
probability measure P iff

$$\sup_{t \in T} E\left[I_{\{|X_t| > a\}} |X_t|\right] \to 0 \ \text{as } a \to \infty$$ (13.80)

This condition plays a key role in many aspects of theoretical probability. 

The relationships between types of convergence are important. Sometimes only one kind can be estab- 
lished. Also, it may be easier to establish one type which implies another of more immediate interest. We 
simply state informally some of the important relationships. A somewhat more detailed summary is given 
in PA, Chapter 17. But for a complete treatment it is necessary to consult more advanced treatments of 
probability and measure. 

Relationships between types of convergence for probability measures 

Consider a sequence $\{X_n : 1 \le n\}$ of random variables.

1. It converges almost surely iff it converges almost uniformly. 

2. If it converges almost surely, then it converges in probability. 

3. It converges in mean, order p, iff it is uniformly integrable and converges in probability. 

4. If it converges in probability, then it converges in distribution (i.e. weakly). 

Various chains of implication can be traced. For example 

• Almost sure convergence implies convergence in probability implies convergence in distribution. 

• Almost sure convergence and uniform integrability imply convergence in mean p.

We do not develop the underlying theory. While much of it could be treated with elementary ideas, a 
complete treatment requires considerable development of the underlying measure theory. However, it is 
important to be aware of these various types of convergence, since they are frequently utilized in advanced 
treatments of applied probability and of statistics. 
13.3 Simple Random Samples and Statistics 3
13.3.1 Simple Random Samples and Statistics 

We formulate the notion of a (simple) random sample, which is basic to much of classical statistics. Once 
formulated, we may apply probability theory to exhibit several basic ideas of statistical analysis. 

We begin with the notion of a population distribution. A population may be most any collection of 
individuals or entities. Associated with each member is a quantity or a feature that can be assigned a 
number. The quantity varies throughout the population. The population distribution is the distribution of 
that quantity among the members of the population. 

If each member could be observed, the population distribution could be determined completely. However, 
that is not always feasible. In order to obtain information about the population distribution, we select "at 
random" a subset of the population and observe how the quantity varies over the sample. Hopefully, the 
sample distribution will give a useful approximation to the population distribution. 
The sampling process 

We take a sample of size n, which means we select n members of the population and observe the quantity 
associated with each. The selection is done in such a manner that on any trial each member is equally 
likely to be selected. Also, the sampling is done in such a way that the result of any one selection does not 
affect, and is not affected by, the others. It appears that we are describing a composite trial. We model the 
sampling process as follows: 

Let $X_i$, $1 \le i \le n$, be the random variable for the ith component trial. Then the class $\{X_i : 1 \le i \le n\}$
is iid, with each member having the population distribution.

This provides a model for sampling either from a very large population (often referred to as an infinite 
population) or sampling with replacement from a small population. 

The goal is to determine as much as possible about the character of the population. Two important 
parameters are the mean and the variance. We want the population mean and the population variance. 
If the sample is representative of the population, then the sample mean and the sample variance should 
approximate the population quantities. 

• The sampling process is the iid class $\{X_i : 1 \le i \le n\}$.

• A random sample is an observation, or realization, $(t_1, t_2, \cdots, t_n)$ of the sampling process.

The sample average and the population mean 

Consider the numerical average of the values in the sample, $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} t_i$. This is an observation of the
sample average

$$A_n = \frac{1}{n} \sum_{i=1}^{n} X_i = \frac{1}{n} S_n$$ (13.81)

The sample sum $S_n$ and the sample average $A_n$ are random variables. If another observation were made
(another sample taken), the observed value of these quantities would probably be different. Now $S_n$ and
$A_n$ are functions of the random variables $\{X_i : 1 \le i \le n\}$ in the sampling process. As such, they have
distributions related to the population distribution (the common distribution of the $X_i$). According to the
central limit theorem, for any reasonable sized sample they should be approximately normally distributed.
As the examples demonstrating the central limit theorem show, the sample size need not be large in many
cases. Now if the population mean E[X] is $\mu$ and the population variance Var[X] is $\sigma^2$, then

$$E[S_n] = \sum_{i=1}^{n} E[X_i] = nE[X] = n\mu \quad \text{and} \quad \operatorname{Var}[S_n] = \sum_{i=1}^{n} \operatorname{Var}[X_i] = n\operatorname{Var}[X] = n\sigma^2$$ (13.82)



3 This content is available online at <http://cnx.org/content/m23496/1.7/>.
so that

$$E[A_n] = \frac{1}{n} E[S_n] = \mu \quad \text{and} \quad \operatorname{Var}[A_n] = \frac{1}{n^2} \operatorname{Var}[S_n] = \sigma^2/n$$ (13.83)

Herein lies the key to the usefulness of a large sample. The mean of the sample average $A_n$ is the same as
the population mean, but the variance of the sample average is 1/n times the population variance. Thus,
for large enough sample, the probability is high that the observed value of the sample average will be close
to the population mean. The population standard deviation, as a measure of the variation, is reduced by a
factor $1/\sqrt{n}$.
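
The relations $E[A_n] = \mu$ and $\operatorname{Var}[A_n] = \sigma^2/n$ can be checked exactly for a small integer-valued population by computing the distribution of $S_n$ with conv. The following is a sketch in base MATLAB; the population on the values 0, 1, 2, 3 is hypothetical and is not data from the text.

```matlab
p  = [0.1 0.4 0.3 0.2];          % hypothetical population: P(X = k), k = 0..3
k  = 0:3;
mu   = dot(k, p);                % population mean
sig2 = dot(k.^2, p) - mu^2;      % population variance

n  = 10;                         % sample size
ps = 1;                          % distribution of the empty sum
for i = 1:n
    ps = conv(ps, p);            % after the loop: distribution of S_n
end
s  = 0:(length(ps) - 1);         % values of S_n
a  = s / n;                      % corresponding values of A_n = S_n / n

EA = dot(a, ps);                 % E[A_n]
VA = dot(a.^2, ps) - EA^2;       % Var[A_n]
disp([EA mu; VA sig2/n])         % each row: computed value, predicted value
```

The two entries in each displayed row should agree to within rounding error, for any choice of p and n.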

Example 13.15: Sample size 

Suppose a population has mean /i and variance a 2 . A sample of size n is to be taken. There are 
complementary questions: 

1. If n is given, what is the probability the sample average lies within distance a from the 
population mean? 

2. What value of n is required to ensure a probability of at least p that the sample average lies 
within distance a from the population mean? 

SOLUTION 

Suppose the sample variance is known or can be approximated reasonably. If the sample size n is 
reasonably large, depending on the population distribution (as seen in the previous demonstrations), 
then A n is approximately N (/i, a 2 jn). 

1. Sample size given, probability to be determined. 

A n - /i 



P(\A n -n\ <a) = P 



aWn 



< -^— = 2$ (a^,/a) - 1 (13.84) 



2. Sample size to be determined, probability specified. 

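$$2\Phi\left(a\sqrt{n}/\sigma\right) - 1 \ge p \quad \text{iff} \quad \Phi\left(a\sqrt{n}/\sigma\right) \ge \frac{p+1}{2} \qquad (13.85)$$

Find from a table or by use of the inverse normal function the value of $x = a\sqrt{n}/\sigma$ required
to make $\Phi(x)$ at least $(p+1)/2$. Then

$$n \ge \sigma^2 (x/a)^2 = \left(\frac{\sigma}{a}\right)^2 x^2 \qquad (13.86)$$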

We may use the MATLAB function norminv to calculate values of $x$ for various $p$.

p = [0.8 0.9 0.95 0.98 0.99];
x = norminv(0,1,(1+p)/2);
disp([p;x;x.^2]')
    0.8000    1.2816    1.6424
    0.9000    1.6449    2.7055
    0.9500    1.9600    3.8415
    0.9800    2.3263    5.4119
    0.9900    2.5758    6.6349

For $p = 0.95$, $\sigma = 2$, $a = 0.2$, we need $n \ge (2/0.2)^2 \cdot 3.8415 = 384.15$. Use at least 385, or perhaps 400, because
of uncertainty about the actual $\sigma^2$.
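The two questions of Example 13.15 can also be answered without a table. The following is a minimal sketch in Python (standard library only), not one of the text's m-programs; the function names `coverage_probability` and `required_sample_size` are hypothetical, and the normal approximation for $A_n$ is assumed to hold.

```python
import math
from statistics import NormalDist

def coverage_probability(n, sigma, a):
    """P(|A_n - mu| <= a) under the normal approximation, as in (13.84)."""
    return 2 * NormalDist().cdf(a * math.sqrt(n) / sigma) - 1

def required_sample_size(p, sigma, a):
    """Smallest integer n with n >= (sigma/a)^2 x^2, where Phi(x) = (1 + p)/2, as in (13.86)."""
    x = NormalDist().inv_cdf((1 + p) / 2)
    return math.ceil((sigma / a) ** 2 * x ** 2)

n = required_sample_size(0.95, sigma=2, a=0.2)
print(n, round(coverage_probability(n, sigma=2, a=0.2), 4))
```

Because the result depends on $\sigma$ only through the ratio $\sigma/a$, any uncertainty in $\sigma$ carries over quadratically to $n$, which is the reason for rounding up generously.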

The idea of a statistic 
As a function of the random variables in the sampling process, the sample average is an example of a
statistic.

Definition. A statistic is a function of the class $\{X_i : 1 \le i \le n\}$ which uses explicitly no unknown
parameters of the population.

Example 13.16: Statistics as functions of the sampling process 

The random variable

$$W = \frac{1}{n}\sum_{i=1}^{n} (X_i - \mu)^2, \quad \text{where } \mu = E[X] \qquad (13.87)$$

is not a statistic, since it uses the unknown parameter $\mu$. However, the following is a statistic.



$$V_n^* = \frac{1}{n}\sum_{i=1}^{n} (X_i - A_n)^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - A_n^2 \qquad (13.88)$$



It would appear that $V_n^*$ might be a reasonable estimate of the population variance. However, the following
result shows that a slight modification is desirable.

Example 13.17: An estimator for the population variance 

The statistic

$$V_n = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - A_n)^2 \qquad (13.89)$$

is an estimator for the population variance.
VERIFICATION

Consider the statistic

$$V_n^* = \frac{1}{n}\sum_{i=1}^{n} (X_i - A_n)^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - A_n^2 \qquad (13.90)$$



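Noting that $E[X^2] = \sigma^2 + \mu^2$ and, likewise, $E[A_n^2] = \sigma^2/n + \mu^2$, we use the last expression to show

$$E[V_n^*] = \frac{1}{n}\, n\left(\sigma^2 + \mu^2\right) - \left(\frac{\sigma^2}{n} + \mu^2\right) = \frac{n-1}{n}\sigma^2 \qquad (13.91)$$

The quantity has a bias in the average. If we consider

$$V_n = \frac{n}{n-1}V_n^* = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - A_n)^2, \quad \text{then} \quad E[V_n] = \frac{n}{n-1}\cdot\frac{n-1}{n}\sigma^2 = \sigma^2 \qquad (13.92)$$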



The quantity $V_n$ with $1/(n-1)$ rather than $1/n$ is often called the sample variance to distinguish
it from the population variance. If the set of numbers

$$(t_1, t_2, \cdots, t_N) \qquad (13.93)$$

represents the complete set of values in a population of $N$ members, the variance for the population
would be given by

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} t_i^2 - \left(\frac{1}{N}\sum_{i=1}^{N} t_i\right)^2 \qquad (13.94)$$

Here we use $1/N$ rather than $1/(N-1)$.
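The bias of the $1/n$ form and the unbiasedness of the $1/(n-1)$ form can be checked exactly on a small case. The sketch below is in Python (standard library only), not one of the text's m-programs; the three-point population distribution and the sample size $n = 3$ are assumptions chosen only for illustration. It enumerates every possible iid sample and averages the statistic with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population distribution (values and probabilities are illustrative)
values = [0, 1, 3]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
n = 3  # sample size

mu = sum(t * p for t, p in zip(values, probs))
sigma2 = sum((t - mu) ** 2 * p for t, p in zip(values, probs))

def expected_statistic(divisor):
    """Exact expectation of sum((X_i - A_n)^2) / divisor over all iid samples of size n."""
    total = Fraction(0)
    for outcome in product(range(len(values)), repeat=n):
        weight = Fraction(1)
        for k in outcome:
            weight *= probs[k]          # independence: joint probability is the product
        sample = [values[k] for k in outcome]
        avg = Fraction(sum(sample), n)  # sample average A_n
        total += weight * sum((x - avg) ** 2 for x in sample) / divisor
    return total

e_v_star = expected_statistic(n)      # 1/n form
e_v = expected_statistic(n - 1)       # 1/(n-1) form
print(sigma2, e_v_star, e_v)
```

In this small case the $1/n$ form averages to $\frac{n-1}{n}\sigma^2$, while the $1/(n-1)$ form averages to $\sigma^2$ exactly.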
Since the statistic $V_n$ has mean value $\sigma^2$, it seems a reasonable candidate for an estimator of the population
variance. To ask how good it is, we need to consider its variance, since $V_n$ is itself a random variable.
An evaluation similar to that for the mean, but more complicated in detail, shows that

$$\mathrm{Var}[V_n] = \frac{1}{n}\left(\mu_4 - \frac{n-3}{n-1}\sigma^4\right), \quad \text{where } \mu_4 = E\left[(X - \mu)^4\right] \qquad (13.95)$$

For large $n$, $\mathrm{Var}[V_n]$ is small, so that $V_n$ is a good large-sample estimator for $\sigma^2$.

Example 13.18: A sampling demonstration of the CLT 

Consider a population random variable $X \sim$ uniform $[-1, 1]$. Then $E[X] = 0$ and $\mathrm{Var}[X] = 1/3$.
We take 100 samples of size 100, and determine the sample sums. This gives a sample of size 100 of
the sample sum random variable $S_{100}$, which has mean zero and variance 100/3. For each observed
value of the sample sum random variable, we plot the fraction of observed sums less than or equal
to that value. This yields an experimental distribution function for $S_{100}$, which is compared with
the distribution function for a random variable $Y \sim N(0, 100/3)$.

rand('seed',0)                  % Seeds random number generator for later comparison
tappr                           % Approximation setup
Enter matrix [a b] of x-range endpoints  [-1 1]
Enter number of x approximation points  100
Enter density as a function of t  0.5*(t<=1)
Use row matrices X and PX as in the simple case
qsample                         % Creates sample
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX
Sample size n =  10000          % Master sample size 10,000
 Sample average ex = 0.003746
 Approximate population mean E(X) = 1.561e-17
 Sample variance vx = 0.3344
 Approximate population variance V(X) = 0.3333
m = 100;
a = reshape(T,m,m);             % Forms 100 samples of size 100
A = sum(a);                     % Matrix A of sample sums
[t,f] = csort(A,ones(1,m));     % Sorts A and determines cumulative
p = cumsum(f)/m;                % fraction of elements <= each value
pg = gaussian(0,100/3,t);       % Gaussian dbn for sample sum values
plot(t,p,'k-',t,pg,'k-.')       % Comparative plot
% Plotting details (see Figure 13.7)
Figure 13.7: The central limit theorem for sample sums.
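The same comparison can be sketched without the m-programs tappr, qsample, and csort. The following Python fragment (standard library only) is an illustration under stated assumptions, not the text's procedure: it draws directly from uniform $[-1, 1]$ rather than from a discrete approximation, and it uses 2000 sample sums instead of 100 so that the experimental distribution function is smoother. The names `reps` and `max_gap` are hypothetical.

```python
import math
import random
from statistics import NormalDist

random.seed(0)   # fixed seed so the run is repeatable
m = 100          # size of each sample
reps = 2000      # number of observed sample sums (assumption: more than the 100 used in the text)

# Observed values of the sample sum S_100, sorted for the empirical distribution function
sums = sorted(sum(random.uniform(-1, 1) for _ in range(m)) for _ in range(reps))

# Comparison distribution Y ~ N(0, 100/3); NormalDist takes the standard deviation
target = NormalDist(0, math.sqrt(m / 3))

# Largest gap between the fraction of sums <= s and the normal distribution function at s
max_gap = max(abs((k + 1) / reps - target.cdf(s)) for k, s in enumerate(sums))
print(round(max_gap, 4))
```

A small value of `max_gap` corresponds to the two curves in the figure lying close together.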



13.4 Problems on Transform Methods 4 

Exercise 13.1 (Solution on p. 412.) 

Calculate directly the generating function $g_X(s)$ for the geometric ($p$) distribution.

Exercise 13.2 (Solution on p. 412.)

Calculate directly the generating function $g_X(s)$ for the Poisson ($\mu$) distribution.

Exercise 13.3 (Solution on p. 412.) 

A projection bulb has life (in hours) represented by X ~ exponential (1/50). The unit will be 
replaced immediately upon failure or at 60 hours, whichever comes first. Determine the moment 
generating function for the time Y to replacement. 

Exercise 13.4 (Solution on p. 412.) 

Simple random variable $X$ has distribution

X = [-3 -2 0 1 4]    PX = [0.15 0.20 0.30 0.25 0.10]    (13.96)

a. Determine the moment generating function for $X$.

b. Show by direct calculation that $M_X'(0) = E[X]$ and $M_X''(0) = E[X^2]$.



This content is available online at <http://cnx.Org/content/m24424/l.5/>. 
Exercise 13.5 (Solution on p. 412.)

Use the moment generating function to obtain the variances for the following distributions:
Exponential ($\lambda$), Gamma ($\alpha, \lambda$), Normal ($\mu, \sigma^2$).

Exercise 13.6 (Solution on p. 413.) 

The pair $\{X, Y\}$ is iid with common moment generating function $\frac{\lambda^3}{(\lambda - s)^3}$. Determine the moment
generating function for $Z = 2X - 4Y + 3$.

Exercise 13.7 (Solution on p. 413.) 

The pair $\{X, Y\}$ is iid with common moment generating function $M_X(s) = 0.6 + 0.4e^s$. Determine
the moment generating function for $Z = 5X + 2Y$.

Exercise 13.8 (Solution on p. 413.) 

Use the moment generating function for the symmetric triangular distribution (list, p. 388) on
$(-c, c)$ as derived in the section "Three Basic Transforms".

a. Obtain an expression for the moment generating function of the symmetric triangular distribution on $(a, b)$ for any $a < b$.

b. Use the result of part (a) to show that the sum of two independent random variables uniform
on $(a, b)$ has symmetric triangular distribution on $(2a, 2b)$.

Exercise 13.9 (Solution on p. 413.) 

Random variable $X$ has moment generating function $\frac{p^2}{(1 - qe^s)^2}$.

a. Use derivatives to determine $E[X]$ and $\mathrm{Var}[X]$.

b. Recognize the distribution from the form and compare $E[X]$ and $\mathrm{Var}[X]$ with the result of
part (a).

Exercise 13.10 (Solution on p. 413.) 

The pair $\{X, Y\}$ is independent, with $X \sim$ Poisson (4) and $Y \sim$ geometric (0.3). Determine the
generating function $g_Z$ for $Z = 3X + 2Y$.

Exercise 13.11 (Solution on p. 413.) 

Random variable $X$ has moment generating function

$$M_X(s) = \frac{1}{1 - 3s}\cdot\exp\left(16s^2/2 + 3s\right) \qquad (13.97)$$

By recognizing forms and using rules of combinations, determine $E[X]$ and $\mathrm{Var}[X]$.

Exercise 13.12 (Solution on p. 413.) 

Random variable $X$ has moment generating function

$$M_X(s) = \frac{\exp\left(3(e^s - 1)\right)}{1 - 5s}\cdot\exp\left(16s^2/2 + 3s\right) \qquad (13.98)$$

By recognizing forms and using rules of combinations, determine $E[X]$ and $\mathrm{Var}[X]$.

Exercise 13.13 (Solution on p. 414.) 

Suppose the class $\{A, B, C\}$ of events is independent, with respective probabilities 0.3, 0.5, 0.2.
Consider

$$X = -3I_A + 2I_B + 4I_C \qquad (13.99)$$

a. Determine the moment generating functions for $I_A$, $I_B$, $I_C$ and use properties of moment
generating functions to determine the moment generating function for $X$.

b. Use the moment generating function to determine the distribution for $X$.

c. Use canonic to determine the distribution. Compare with result (b).

d. Use distributions for the separate terms; determine the distribution for the sum with mgsum3.
Compare with result (b).

Exercise 13.14 (Solution on p. 414.) 

Suppose the pair {X, Y} is independent, with both X and Y binomial. Use generating functions 
to show under what condition, if any, X + Y is binomial. 

Exercise 13.15 (Solution on p. 414.) 

Suppose the pair {X, Y} is independent, with both X and Y Poisson. 

a. Use generating functions to show under what condition X + Y is Poisson. 

b. What about $X - Y$? Justify your answer.

Exercise 13.16 (Solution on p. 414.) 

Suppose the pair $\{X, Y\}$ is independent, $Y$ is nonnegative integer-valued, $X$ is Poisson and $X + Y$
is Poisson. Use the generating functions to show that $Y$ is Poisson.

Exercise 13.17 (Solution on p. 414.) 

Suppose the pair $\{X, Y\}$ is iid, binomial (6, 0.51). By the result of Exercise 13.14,
$X + Y$ is binomial. Use mgsum to obtain the distribution for $Z = 2X + 4Y$. Does $Z$ have the
binomial distribution? Is the result surprising? Examine the first few possible values for $Z$. Write
the generating function for $Z$; does it have the form for the binomial distribution?

Exercise 13.18 (Solution on p. 415.) 

Suppose the pair $\{X, Y\}$ is independent, with $X \sim$ binomial (5, 0.33) and $Y \sim$ binomial (7, 0.47).

Let $G = g(X) = 3X^2 - 2X$ and $H = h(Y) = 2Y^2 + Y + 3$.

a. Use mgsum to obtain the distribution for $G + H$.

b. Use icalc and csort to obtain the distribution for G + H and compare with the result of part 
(a). 

Exercise 13.19 (Solution on p. 415.) 

Suppose the pair $\{X, Y\}$ is independent, with $X \sim$ binomial (8, 0.39) and
$Y \sim$ uniform on $\{-1.3, -0.5, 1.3, 2.2, 3.5\}$. Let

$$U = 3X^2 - 2X + 1 \quad \text{and} \quad V = Y^3 + 2Y - 3 \qquad (13.100)$$



a. Use mgsum to obtain the distribution for U + V. 

b. Use icalc and csort to obtain the distribution for U + V and compare with the result of part 
(a). 

Exercise 13.20 (Solution on p. 416.) 

If $X$ is a nonnegative integer-valued random variable, express the generating function as a power
series.

a. Show that the $k$th derivative at $s = 1$ is

$$g_X^{(k)}(1) = E[X(X-1)(X-2)\cdots(X-k+1)] \qquad (13.101)$$

b. Use this to show that $\mathrm{Var}[X] = g_X''(1) + g_X'(1) - [g_X'(1)]^2$.
Exercise 13.21 (Solution on p. 416.)

Let $M_X(\cdot)$ be the moment generating function for $X$.

a. Show that $\mathrm{Var}[X]$ is the second derivative of $e^{-s\mu}M_X(s)$ evaluated at $s = 0$.

b. Use this fact to show that if $X \sim N(\mu, \sigma^2)$, then $\mathrm{Var}[X] = \sigma^2$.

Exercise 13.22 (Solution on p. 416.) 

Use derivatives of $M_{X_m}(s)$ to obtain the mean and variance of the negative binomial ($m, p$)
distribution.

Exercise 13.23 (Solution on p. 416.) 

Use moment generating functions to show that variances add for the sum or difference of indepen- 
dent random variables. 

Exercise 13.24 (Solution on p. 416.) 

The pair $\{X, Y\}$ is iid $N(3, 5)$. Use the moment generating function to show that $Z = 3X - 2Y + 3$
is normal (see Example 3 (Example 13.3: Affine combination of independent normal random
variables) from "Transform Methods" for the general result).

Exercise 13.25 (Solution on p. 417.) 

Use the central limit theorem to show that for large enough sample size (usually 20 or more), the
sample average

$$A_n = \frac{1}{n}\sum_{i=1}^{n} X_i \qquad (13.102)$$

is approximately $N(\mu, \sigma^2/n)$ for any reasonable population distribution having mean value $\mu$ and
variance $\sigma^2$.

Exercise 13.26 (Solution on p. 417.) 

A population has standard deviation approximately three. It is desired to determine the sample 
size n needed to ensure that with probability 0.95 the sample average will be within 0.5 of the mean 
value. 

a. Use the Chebyshev inequality to estimate the needed sample size. 

b. Use the normal approximation to estimate n (see Example 1 (Example 13.15: Sample size) 
from "Simple Random Samples and Statistics"). 
Solutions to Exercises in Chapter 13

Solution to Exercise 13.1 (p. 408) 

$$g_X(s) = E[s^X] = \sum_{k=0}^{\infty} p_k s^k = p\sum_{k=0}^{\infty} q^k s^k = \frac{p}{1 - qs} \quad \text{(geometric series)} \qquad (13.103)$$

Solution to Exercise 13.2 (p. 408) 



$$g_X(s) = E[s^X] = \sum_{k=0}^{\infty} p_k s^k = e^{-\mu}\sum_{k=0}^{\infty} \frac{\mu^k s^k}{k!} = e^{-\mu}e^{\mu s} = e^{\mu(s-1)} \qquad (13.104)$$

Solution to Exercise 13.3 (p. 408) 

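With $\lambda = 1/50$ and $a = 60$,

$$Y = I_{[0,a]}(X)\,X + I_{(a,\infty)}(X)\,a \qquad e^{sY} = I_{[0,a]}(X)\,e^{sX} + I_{(a,\infty)}(X)\,e^{as} \qquad (13.105)$$

$$M_Y(s) = \int_0^a e^{st}\lambda e^{-\lambda t}\,dt + e^{sa}\int_a^{\infty} \lambda e^{-\lambda t}\,dt \qquad (13.106)$$

$$= \frac{\lambda}{\lambda - s}\left[1 - e^{-(\lambda - s)a}\right] + e^{-(\lambda - s)a} \qquad (13.107)$$

Solution to Exercise 13.4 (p. 408)

$$M_X(s) = 0.15e^{-3s} + 0.20e^{-2s} + 0.30 + 0.25e^{s} + 0.10e^{4s} \qquad (13.108)$$

$$M_X'(s) = -3\cdot 0.15e^{-3s} - 2\cdot 0.20e^{-2s} + 0 + 0.25e^{s} + 4\cdot 0.10e^{4s} \qquad (13.109)$$

$$M_X''(s) = (-3)^2\cdot 0.15e^{-3s} + (-2)^2\cdot 0.20e^{-2s} + 0 + 0.25e^{s} + 4^2\cdot 0.10e^{4s} \qquad (13.110)$$

Setting $s = 0$ and using $e^0 = 1$ give the desired results.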
Solution to Exercise 13.5 (p. 409) 

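a. Exponential:

$$M_X(s) = \frac{\lambda}{\lambda - s} \qquad M_X'(s) = \frac{\lambda}{(\lambda - s)^2} \qquad M_X''(s) = \frac{2\lambda}{(\lambda - s)^3} \qquad (13.111)$$

$$E[X] = \frac{\lambda}{\lambda^2} = \frac{1}{\lambda} \qquad E[X^2] = \frac{2\lambda}{\lambda^3} = \frac{2}{\lambda^2} \qquad \mathrm{Var}[X] = \frac{2}{\lambda^2} - \left(\frac{1}{\lambda}\right)^2 = \frac{1}{\lambda^2} \qquad (13.112)$$

b. Gamma ($\alpha, \lambda$):

$$M_X(s) = \left(\frac{\lambda}{\lambda - s}\right)^{\alpha} \qquad M_X'(s) = \alpha\left(\frac{\lambda}{\lambda - s}\right)^{\alpha - 1}\frac{\lambda}{(\lambda - s)^2} = \alpha\left(\frac{\lambda}{\lambda - s}\right)^{\alpha}\frac{1}{\lambda - s} \qquad (13.113)$$

$$M_X''(s) = \alpha^2\left(\frac{\lambda}{\lambda - s}\right)^{\alpha}\frac{1}{(\lambda - s)^2} + \alpha\left(\frac{\lambda}{\lambda - s}\right)^{\alpha}\frac{1}{(\lambda - s)^2} \qquad (13.114)$$

$$E[X] = \frac{\alpha}{\lambda} \qquad E[X^2] = \frac{\alpha^2 + \alpha}{\lambda^2} \qquad \mathrm{Var}[X] = \frac{\alpha}{\lambda^2} \qquad (13.115)$$

c. Normal ($\mu, \sigma^2$):

$$M_X(s) = \exp\left(\frac{\sigma^2 s^2}{2} + \mu s\right) \qquad M_X'(s) = M_X(s)\cdot\left(\sigma^2 s + \mu\right) \qquad (13.116)$$

$$M_X''(s) = M_X(s)\cdot\left(\sigma^2 s + \mu\right)^2 + M_X(s)\,\sigma^2 \qquad (13.117)$$

$$E[X] = \mu \qquad E[X^2] = \mu^2 + \sigma^2 \qquad \mathrm{Var}[X] = \sigma^2 \qquad (13.118)$$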
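The derivative calculations can be checked numerically. The sketch below is in Python (standard library only), not one of the text's m-programs; it approximates $M_X'(0)$ and $M_X''(0)$ by central differences, so the results are approximate. The helper name `moments_from_mgf`, the step size, and the parameter values $\lambda = 2$, $\mu = 3$, $\sigma^2 = 16$ are assumptions for illustration.

```python
import math

def moments_from_mgf(M, h=1e-3):
    """Approximate (E[X], Var[X]) from a moment generating function M by central differences at s = 0."""
    m1 = (M(h) - M(-h)) / (2 * h)              # approximates M'(0) = E[X]
    m2 = (M(h) - 2 * M(0.0) + M(-h)) / h**2    # approximates M''(0) = E[X^2]
    return m1, m2 - m1**2

lam = 2.0
mean_exp, var_exp = moments_from_mgf(lambda s: lam / (lam - s))

mu, sigma2 = 3.0, 16.0
mean_norm, var_norm = moments_from_mgf(lambda s: math.exp(sigma2 * s**2 / 2 + mu * s))

print(round(mean_exp, 4), round(var_exp, 4), round(mean_norm, 3), round(var_norm, 3))
```

The approximations agree with $1/\lambda$, $1/\lambda^2$, $\mu$, and $\sigma^2$ to within the truncation error of the difference formulas.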

Solution to Exercise 13.6 (p. 409) 

$$M_Z(s) = e^{3s}\left(\frac{\lambda}{\lambda - 2s}\right)^3\left(\frac{\lambda}{\lambda + 4s}\right)^3 \qquad (13.119)$$

Solution to Exercise 13.7 (p. 409) 

$$M_Z(s) = \left(0.6 + 0.4e^{5s}\right)\left(0.6 + 0.4e^{2s}\right) \qquad (13.120)$$

Solution to Exercise 13.8 (p. 409) 

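Let $m = (a + b)/2$ and $c = (b - a)/2$. If $Y \sim$ symmetric triangular on $(-c, c)$, then $X = Y + m$ is symmetric
triangular on $(m - c, m + c) = (a, b)$ and

$$M_X(s) = e^{ms}M_Y(s) = \frac{e^{cs} + e^{-cs} - 2}{c^2 s^2}\,e^{ms} = \frac{e^{(m+c)s} + e^{(m-c)s} - 2e^{ms}}{c^2 s^2} = \frac{e^{bs} + e^{as} - 2e^{\frac{a+b}{2}s}}{\left(\frac{b-a}{2}\right)^2 s^2} \qquad (13.121)$$

For $X$, $Y$ independent and uniform on $(a, b)$,

$$M_{X+Y}(s) = \left[\frac{e^{sb} - e^{sa}}{s(b - a)}\right]^2 = \frac{e^{s2b} + e^{s2a} - 2e^{s(b+a)}}{s^2(b - a)^2} \qquad (13.122)$$

which is the form (13.121) for the interval $(2a, 2b)$.

Solution to Exercise 13.9 (p. 409)

$$\left[p^2\left(1 - qe^s\right)^{-2}\right]' = \frac{2p^2 qe^s}{\left(1 - qe^s\right)^3} \quad \text{so that} \quad E[X] = 2q/p \qquad (13.123)$$

$$\left[p^2\left(1 - qe^s\right)^{-2}\right]'' = \frac{6p^2 q^2 e^{2s}}{\left(1 - qe^s\right)^4} + \frac{2p^2 qe^s}{\left(1 - qe^s\right)^3} \quad \text{so that} \quad E[X^2] = \frac{6q^2}{p^2} + \frac{2q}{p} \qquad (13.124)$$

$$\mathrm{Var}[X] = \frac{2q^2}{p^2} + \frac{2q}{p} = \frac{2\left(q^2 + pq\right)}{p^2} = \frac{2q}{p^2} \qquad (13.125)$$

$X \sim$ negative binomial $(2, p)$, which has $E[X] = 2q/p$ and $\mathrm{Var}[X] = 2q/p^2$.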
Solution to Exercise 13.10 (p. 409) 



$$g_Z(s) = g_X(s^3)\,g_Y(s^2) = e^{4(s^3 - 1)}\cdot\frac{p}{1 - qs^2}, \quad \text{with } p = 0.3,\ q = 0.7 \qquad (13.126)$$

Solution to Exercise 13.11 (p. 409) 

$$X = X_1 + X_2 \quad \text{with} \quad X_1 \sim \text{exponential}(1/3), \quad X_2 \sim N(3, 16) \qquad (13.127)$$

$$E[X] = 3 + 3 = 6 \qquad \mathrm{Var}[X] = 9 + 16 = 25 \qquad (13.128)$$

Solution to Exercise 13.12 (p. 409) 

$$X = X_1 + X_2 + X_3, \quad \text{with} \quad X_1 \sim \text{Poisson}(3), \quad X_2 \sim \text{exponential}(1/5), \quad X_3 \sim N(3, 16) \qquad (13.129)$$

$$E[X] = 3 + 5 + 3 = 11 \qquad \mathrm{Var}[X] = 3 + 25 + 16 = 44 \qquad (13.130)$$
Solution to Exercise 13.13 (p. 409)

$$M_X(s) = \left(0.7 + 0.3e^{-3s}\right)\left(0.5 + 0.5e^{2s}\right)\left(0.8 + 0.2e^{4s}\right) \qquad (13.131)$$

$$= 0.12e^{-3s} + 0.12e^{-s} + 0.28 + 0.03e^{s} + 0.28e^{2s} + 0.03e^{3s} + 0.07e^{4s} + 0.07e^{6s} \qquad (13.132)$$

The distribution is

X = [-3 -1 0 1 2 3 4 6]    PX = [0.12 0.12 0.28 0.03 0.28 0.03 0.07 0.07]    (13.133)

c = [-3 2 4 0];
P = 0.1*[3 5 2];
canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  minprob(P)
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
P1 = [0.7 0.3];
P2 = [0.5 0.5];
P3 = [0.8 0.2];
X1 = [0 -3];
X2 = [0 2];
X3 = [0 4];
[x,px] = mgsum3(X1,X2,X3,P1,P2,P3);
disp([X;PX;x;px]')
   -3.0000    0.1200   -3.0000    0.1200
   -1.0000    0.1200   -1.0000    0.1200
         0    0.2800         0    0.2800
    1.0000    0.0300    1.0000    0.0300
    2.0000    0.2800    2.0000    0.2800
    3.0000    0.0300    3.0000    0.0300
    4.0000    0.0700    4.0000    0.0700
    6.0000    0.0700    6.0000    0.0700

Solution to Exercise 13.14 (p. 410)

Binomial iff both have same $p$, as shown below.

$$g_{X+Y}(s) = (q_1 + p_1 s)^n (q_2 + p_2 s)^m = (q + ps)^{n+m} \quad \text{iff} \quad p_1 = p_2 \qquad (13.134)$$

Solution to Exercise 13.15 (p. 410)

Always Poisson, as the argument below shows.

$$g_{X+Y}(s) = e^{\mu(s-1)}e^{\nu(s-1)} = e^{(\mu + \nu)(s-1)} \qquad (13.135)$$

However, $Y - X$ could have negative values.
Solution to Exercise 13.16 (p. 410) 

$E[X + Y] = \mu + \nu$, where $\nu = E[Y] > 0$. $g_X(s) = e^{\mu(s-1)}$ and $g_{X+Y}(s) = g_X(s)\,g_Y(s) = e^{(\mu+\nu)(s-1)}$.
Division by $g_X(s)$ gives $g_Y(s) = e^{\nu(s-1)}$.
Solution to Exercise 13.17 (p. 410) 
x = 0:6;
px = ibinom(6,0.51,x);
[Z,PZ] = mgsum(2*x,4*x,px,px);
disp([Z(1:5);PZ(1:5)]')
         0    0.0002            % Cannot be binomial, since odd values missing
    2.0000    0.0012
    4.0000    0.0043
    6.0000    0.0118
    8.0000    0.0259

$$g_X(s) = g_Y(s) = (0.49 + 0.51s)^6 \qquad g_Z(s) = \left(0.49 + 0.51s^2\right)^6\left(0.49 + 0.51s^4\right)^6 \qquad (13.136)$$
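The distribution of $Z = 2X + 4Y$ can be reproduced without mgsum by summing products of the marginal probabilities over all pairs of values. The following is a minimal Python sketch (standard library only), an illustration rather than the text's m-program; the dictionary name `pz` is hypothetical.

```python
from collections import defaultdict
from math import comb

n, p = 6, 0.51
# Binomial (6, 0.51) probabilities for k = 0, 1, ..., 6
px = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Independence: P(X = i, Y = j) = P(X = i) P(Y = j); accumulate by value of Z = 2i + 4j
pz = defaultdict(float)
for i, pi in enumerate(px):
    for j, pj in enumerate(px):
        pz[2 * i + 4 * j] += pi * pj

for z in sorted(pz)[:5]:
    print(z, round(pz[z], 4))
```

Only even values of $Z$ receive positive probability, which rules out a binomial distribution on $0, 1, \ldots, 36$.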

Solution to Exercise 13.18 (p. 410) 

X = 0:5;
Y = 0:7;
PX = ibinom(5,0.33,X);
PY = ibinom(7,0.47,Y);
G = 3*X.^2 - 2*X;
H = 2*Y.^2 + Y + 3;
[Z,PZ] = mgsum(G,H,PX,PY);
icalc
Enter row matrix of X-values  X
Enter row matrix of Y-values  Y
Enter X probabilities  PX
Enter Y probabilities  PY
 Use array operations on matrices X, Y, PX, PY, t, u, and P
M = 3*t.^2 - 2*t + 2*u.^2 + u + 3;
[z,pz] = csort(M,P);
e = max(abs(pz - PZ))           % Comparison of p values
e = 0

Solution to Exercise 13.19 (p. 410) 

X = 0:8;
Y = [-1.3 -0.5 1.3 2.2 3.5];
PX = ibinom(8,0.39,X);
PY = (1/5)*ones(1,5);
U = 3*X.^2 - 2*X + 1;
V = Y.^3 + 2*Y - 3;
[Z,PZ] = mgsum(U,V,PX,PY);
icalc
Enter row matrix of X-values  X
Enter row matrix of Y-values  Y
Enter X probabilities  PX
Enter Y probabilities  PY
 Use array operations on matrices X, Y, PX, PY, t, u, and P
M = 3*t.^2 - 2*t + 1 + u.^3 + 2*u - 3;
[z,pz] = csort(M,P);
e = max(abs(pz - PZ))           % Comparison of p values
e = 0

Solution to Exercise 13.20 (p. 410) 

Since power series may be differentiated term by term,

$$g_X^{(n)}(s) = \sum_{k=n}^{\infty} k(k-1)\cdots(k-n+1)\,p_k s^{k-n} \quad \text{so that} \qquad (13.137)$$

$$g_X^{(n)}(1) = \sum_{k=n}^{\infty} k(k-1)\cdots(k-n+1)\,p_k = E[X(X-1)\cdots(X-n+1)] \qquad (13.138)$$

$$\mathrm{Var}[X] = E[X^2] - E^2[X] = E[X(X-1)] + E[X] - E^2[X] = g_X''(1) + g_X'(1) - \left[g_X'(1)\right]^2 \qquad (13.139)$$
Solution to Exercise 13.21 (p. 411) 

$$f(s) = e^{-s\mu}M_X(s) \qquad f''(s) = e^{-s\mu}\left[-\mu M_X'(s) + \mu^2 M_X(s) + M_X''(s) - \mu M_X'(s)\right] \qquad (13.140)$$

Setting $s = 0$ and using the result on moments gives

$$f''(0) = -\mu^2 + \mu^2 + E[X^2] - \mu^2 = \mathrm{Var}[X] \qquad (13.141)$$

Solution to Exercise 13.22 (p. 411) 

To simplify writing use $f(s)$ for $M_X(s)$.

$$f(s) = \frac{p^m}{\left(1 - qe^s\right)^m} \qquad f'(s) = \frac{mp^m qe^s}{\left(1 - qe^s\right)^{m+1}} \qquad f''(s) = \frac{mp^m qe^s}{\left(1 - qe^s\right)^{m+1}} + \frac{m(m+1)p^m q^2 e^{2s}}{\left(1 - qe^s\right)^{m+2}} \qquad (13.142)$$

$$E[X] = \frac{mp^m q}{(1 - q)^{m+1}} = \frac{mq}{p} \qquad E[X^2] = \frac{mq}{p} + \frac{m(m+1)p^m q^2}{(1 - q)^{m+2}} = \frac{mq}{p} + \frac{m(m+1)q^2}{p^2} \qquad (13.143)$$

$$\mathrm{Var}[X] = \frac{mq}{p} + \frac{m(m+1)q^2}{p^2} - \frac{m^2 q^2}{p^2} = \frac{mq}{p^2} \qquad (13.144)$$

Solution to Exercise 13.23 (p. 411) 

To simplify writing, set $f(s) = M_X(s)$, $g(s) = M_Y(s)$, and $h(s) = M_X(s)M_Y(s)$.

$$h'(s) = f'(s)g(s) + f(s)g'(s) \qquad h''(s) = f''(s)g(s) + f'(s)g'(s) + f'(s)g'(s) + f(s)g''(s) \qquad (13.145)$$

Setting $s = 0$ yields

$$E[X + Y] = E[X] + E[Y] \qquad E\left[(X + Y)^2\right] = E[X^2] + 2E[X]E[Y] + E[Y^2] \qquad (13.146)$$

$$E^2[X + Y] = E^2[X] + 2E[X]E[Y] + E^2[Y] \qquad (13.147)$$

Taking the difference gives $\mathrm{Var}[X + Y] = \mathrm{Var}[X] + \mathrm{Var}[Y]$. A similar treatment with $g(s)$ replaced by
$g(-s)$ shows $\mathrm{Var}[X - Y] = \mathrm{Var}[X] + \mathrm{Var}[Y]$.
Solution to Exercise 13.24 (p. 411) 

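$$M_{3X}(s) = M_X(3s) = \exp\left(\frac{9\cdot 5s^2}{2} + 3\cdot 3s\right) \qquad M_{-2Y}(s) = M_Y(-2s) = \exp\left(\frac{4\cdot 5s^2}{2} - 2\cdot 3s\right) \qquad (13.148)$$

$$M_Z(s) = e^{3s}\exp\left(\frac{(45 + 20)s^2}{2} + (9 - 6)s\right) = \exp\left(\frac{65s^2}{2} + 6s\right) \qquad (13.149)$$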
Solution to Exercise 13.25 (p. 411)

$$E[A_n] = \frac{1}{n}\sum_{i=1}^{n}\mu = \mu \qquad \mathrm{Var}[A_n] = \frac{1}{n^2}\sum_{i=1}^{n}\sigma^2 = \frac{\sigma^2}{n} \qquad (13.150)$$

By the central limit theorem, $A_n$ is approximately normal, with the mean and variance above.
Solution to Exercise 13.26 (p. 411) 



Chebyshev inequality:

$$P\left(\frac{|A_n - \mu|}{\sigma/\sqrt{n}} \ge \frac{0.5\sqrt{n}}{3}\right) \le \frac{3^2}{0.5^2\,n} \le 0.05 \quad \text{implies} \quad n \ge 720 \qquad (13.151)$$

Normal approximation: Use of the table in Example 1 (Example 13.15: Sample size) from "Simple
Random Samples and Statistics" shows

$$n \ge (3/0.5)^2 \cdot 3.84 = 138.24, \quad \text{so that } n \ge 139 \qquad (13.152)$$
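The two estimates can be compared directly. The sketch below is in Python (standard library only), not one of the text's m-programs; the function names `n_chebyshev` and `n_normal` are hypothetical, and a small tolerance is subtracted before rounding up so that floating-point noise does not push an exact bound to the next integer.

```python
import math
from statistics import NormalDist

def n_chebyshev(sigma, a, p):
    """Chebyshev bound: sigma^2 / (n a^2) <= 1 - p, solved for the smallest integer n."""
    return math.ceil(sigma**2 / (a**2 * (1 - p)) - 1e-9)

def n_normal(sigma, a, p):
    """Normal approximation: n >= (sigma/a)^2 x^2 with Phi(x) = (1 + p)/2."""
    x = NormalDist().inv_cdf((1 + p) / 2)
    return math.ceil((sigma / a) ** 2 * x**2)

print(n_chebyshev(3, 0.5, 0.95), n_normal(3, 0.5, 0.95))
```

The Chebyshev inequality makes no use of the shape of the distribution, so it calls for a sample roughly five times as large as the normal approximation does.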
Chapter 14

Conditional Expectation, Regression 



14.1 Conditional Expectation, Regression 1 

Conditional expectation, given a random vector, plays a fundamental role in much of modern probability 
theory. Various types of "conditioning" characterize some of the more important random sequences and pro- 
cesses. The notion of conditional independence is expressed in terms of conditional expectation. Conditional 
independence plays an essential role in the theory of Markov processes and in much of decision theory. 

We first consider an elementary form of conditional expectation with respect to an event. Then we 
consider two highly intuitive special cases of conditional expectation, given a random variable. In examin- 
ing these, we identify a fundamental property which provides the basis for a very general extension. We 
discover that conditional expectation is a random quantity. The basic property for conditional expectation 
and properties of ordinary expectation are used to obtain four fundamental properties which imply the "ex- 
pectationlike" character of conditional expectation. An extension of the fundamental property leads directly 
to the solution of the regression problem which, in turn, gives an alternate interpretation of conditional 
expectation. 

14.1.1 Conditioning by an event 

If a conditioning event $C$ occurs, we modify the original probabilities by introducing the conditional probability
measure $P(\cdot|C)$. In making the change from

$$P(A) \quad \text{to} \quad P(A|C) = \frac{P(AC)}{P(C)} \qquad (14.1)$$



we effectively do two things: 



- We limit the possible outcomes to event C 

- We "normalize" the probability mass by taking P (C) as the new unit 

It seems reasonable to make a corresponding modification of mathematical expectation when the occurrence 
of event C is known. The expectation E [X] is the probability weighted average of the values taken on by 
X. Two possibilities for making the modification are suggested. 

• We could replace the prior probability measure $P(\cdot)$ with the conditional probability measure $P(\cdot|C)$
and take the weighted average with respect to these new weights.

• We could continue to use the prior probability measure P (•) and modify the averaging process as 
follows: 



lr This content is available online at <http://cnx.Org/content/m23634/l.5/>. 
- Consider the values $X(\omega)$ for only those $\omega \in C$. This may be done by using the random variable
$I_C X$, which has value $X(\omega)$ for $\omega \in C$ and zero elsewhere. The expectation $E[I_C X]$ is the
probability weighted sum of those values taken on in $C$.

- The weighted average is obtained by dividing by $P(C)$.

These two approaches are equivalent. For a simple random variable $X = \sum_{k=1}^{n} t_k I_{A_k}$ in canonical form

$$E[I_C X]/P(C) = \sum_{k=1}^{n} E[t_k I_C I_{A_k}]/P(C) = \sum_{k=1}^{n} t_k P(CA_k)/P(C) = \sum_{k=1}^{n} t_k P(A_k|C) \qquad (14.2)$$

The final sum is expectation with respect to the conditional probability measure. Arguments using basic 
theorems on expectation and the approximation of general random variables by simple random variables 
allow an extension to a general random variable X. The notion of a conditional distribution, given C, and 
taking weighted averages with respect to the conditional probability is intuitive and natural in this case. 
However, this point of view is limited. In order to display a natural relationship with the more general
concept of conditioning with respect to a random vector, we adopt the following

Definition. The conditional expectation of $X$, given event $C$ with positive probability, is the quantity

$$E[X|C] = \frac{E[I_C X]}{P(C)} = \frac{E[I_C X]}{E[I_C]} \qquad (14.3)$$

Remark. The product form $E[X|C]\,P(C) = E[I_C X]$ is often useful.
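The equivalence of the two approaches is easy to see on a small case. The sketch below is in Python (standard library only); the fair die, the event $C$ = "even face", and all variable names are assumptions chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical setup: a fair die, X = face value, C = {even faces}
outcomes = range(1, 7)
prob = {w: Fraction(1, 6) for w in outcomes}
X = {w: w for w in outcomes}
C = {w for w in outcomes if w % 2 == 0}

p_c = sum(prob[w] for w in C)                  # P(C)

# First approach: average X with respect to the conditional measure P(.|C)
approach_one = sum(X[w] * (prob[w] / p_c) for w in C)

# Second approach: E[I_C X] with the prior measure, then divide by P(C)
e_icx = sum(X[w] * prob[w] for w in C)
approach_two = e_icx / p_c

print(approach_one, approach_two)
```

Both computations give the same value, and the product form $E[X|C]\,P(C) = E[I_C X]$ holds exactly.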

Example 14.1: A numerical example 

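Suppose $X \sim$ exponential ($\lambda$) and $C = \{1/\lambda \le X \le 2/\lambda\}$. Now $I_C = I_M(X)$ where $M = [1/\lambda, 2/\lambda]$.

$$P(C) = P(X \ge 1/\lambda) - P(X > 2/\lambda) = e^{-1} - e^{-2} \quad \text{and} \qquad (14.4)$$

$$E[I_C X] = \int I_M(t)\,t\lambda e^{-\lambda t}\,dt = \int_{1/\lambda}^{2/\lambda} t\lambda e^{-\lambda t}\,dt = \frac{1}{\lambda}\left(2e^{-1} - 3e^{-2}\right) \qquad (14.5)$$

Thus

$$E[X|C] = \frac{2e^{-1} - 3e^{-2}}{\lambda\left(e^{-1} - e^{-2}\right)} \approx \frac{1.418}{\lambda} \qquad (14.6)$$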
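The value in Example 14.1 can be confirmed by numerical integration. The following Python sketch (standard library only) is an illustration, not one of the text's m-programs; the midpoint rule, the number of steps, and the name `conditional_mean` are assumptions.

```python
import math

def conditional_mean(lam, steps=10_000):
    """Approximate E[X | 1/lam <= X <= 2/lam] for X ~ exponential(lam)."""
    a, b = 1 / lam, 2 / lam
    h = (b - a) / steps
    # Midpoint rule for E[I_C X], the integral of t * lam * exp(-lam t) over [a, b]
    e_icx = h * sum(
        t * lam * math.exp(-lam * t)
        for t in (a + (k + 0.5) * h for k in range(steps))
    )
    p_c = math.exp(-1) - math.exp(-2)   # P(C), which does not depend on lam
    return e_icx / p_c

# Closed form for lam * E[X|C]
closed_form = (2 * math.exp(-1) - 3 * math.exp(-2)) / (math.exp(-1) - math.exp(-2))
print(round(closed_form, 4), round(2.0 * conditional_mean(2.0), 4))
```

Multiplying the numerical result by $\lambda$ gives the same constant for any choice of $\lambda$, as the closed form indicates.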



14.1.2 Conditioning by a random vector: discrete case

Suppose $X = \sum_{i=1}^{n} t_i I_{A_i}$ and $Y = \sum_{j=1}^{m} u_j I_{B_j}$ in canonical form. We suppose $P(A_i) = P(X = t_i) > 0$ and
$P(B_j) = P(Y = u_j) > 0$, for each permissible $i, j$. Now

$$P(Y = u_j | X = t_i) = \frac{P(X = t_i, Y = u_j)}{P(X = t_i)} \qquad (14.7)$$

We take the expectation relative to the conditional probability $P(\cdot | X = t_i)$ to get

$$E[g(Y) | X = t_i] = \sum_{j=1}^{m} g(u_j)\,P(Y = u_j | X = t_i) = e(t_i) \qquad (14.8)$$
Since we have a value for each $t_i$ in the range of $X$, the function $e(\cdot)$ is defined on the range of $X$. Now
consider any reasonable set $M$ on the real line and determine the expectation

$$E[I_M(X)\,g(Y)] = \sum_{i=1}^{n}\sum_{j=1}^{m} I_M(t_i)\,g(u_j)\,P(X = t_i, Y = u_j) \qquad (14.9)$$

$$= \sum_{i=1}^{n} I_M(t_i)\left[\sum_{j=1}^{m} g(u_j)\,P(Y = u_j | X = t_i)\right] P(X = t_i) \qquad (14.10)$$

$$= \sum_{i=1}^{n} I_M(t_i)\,e(t_i)\,P(X = t_i) = E[I_M(X)\,e(X)] \qquad (14.11)$$

We have the pattern

$$\text{(A)} \quad E[I_M(X)\,g(Y)] = E[I_M(X)\,e(X)], \quad \text{where } e(t_i) = E[g(Y) | X = t_i] \qquad (14.12)$$

for all $t_i$ in the range of $X$.

We return to examine this property later. But first, consider an example to display the nature of the 
concept. 

Example 14.2: Basic calculations and interpretation 

Suppose the pair $\{X, Y\}$ has the joint distribution

$$P(X = t_i, Y = u_j) \qquad (14.13)$$

             X = 0      1       4       9
    Y =  2    0.05    0.04    0.21    0.15
         0    0.05    0.01    0.09    0.10
        -1    0.10    0.05    0.10    0.05
    PX        0.20    0.10    0.40    0.30

Table 14.1

Calculate $E[Y | X = t_i]$ for each possible value $t_i$ taken on by $X$.

$$E[Y|X = 0] = -1\cdot\frac{0.10}{0.20} + 0\cdot\frac{0.05}{0.20} + 2\cdot\frac{0.05}{0.20} = (-1\cdot 0.10 + 0\cdot 0.05 + 2\cdot 0.05)/0.20 = 0$$

$$E[Y|X = 1] = (-1\cdot 0.05 + 0\cdot 0.01 + 2\cdot 0.04)/0.10 = 0.30$$

$$E[Y|X = 4] = (-1\cdot 0.10 + 0\cdot 0.09 + 2\cdot 0.21)/0.40 = 0.80$$

$$E[Y|X = 9] = (-1\cdot 0.05 + 0\cdot 0.10 + 2\cdot 0.15)/0.30 = 0.83$$

The pattern of operation in each case can be described as follows:

• For the $i$th column, multiply each value $u_j$ by $P(X = t_i, Y = u_j)$, sum, then divide by
$P(X = t_i)$.

The following interpretation helps visualize the conditional expectation and points to an important 
result in the general case. 
• For each $t_i$ we use the mass distributed "above" it. This mass is distributed along a vertical
line at values $u_j$ taken on by $Y$. The result of the computation is to determine the center of
mass for the conditional distribution above $t = t_i$. As in the case of ordinary expectations,
this should be the best estimate, in the mean-square sense, of $Y$ when $X = t_i$. We examine
that possibility in the treatment of the regression problem in Section 14.1.5 (The regression
problem).

Although the calculations are not difficult for a problem of this size, the basic pattern can be 
implemented simply with MATLAB, making the handling of much larger problems quite easy. This 
is particularly useful in dealing with the simple approximation to an absolutely continuous pair. 

X = [0 1 4 9];                        % Data for the joint distribution
Y = [-1 0 2];
P = 0.01*[ 5 4 21 15; 5 1 9 10; 10 5 10 5];
jcalc                                 % Setup for calculations
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
 Use array operations on matrices X, Y, PX, PY, t, u, and P
EYX = sum(u.*P)./sum(P);              % sum(P) = PX (operation sum yields column sums)
disp([X;EYX]')                        % u.*P = u_j P(X = t_i, Y = u_j) for all i, j
         0         0
    1.0000    0.3000
    4.0000    0.8000
    9.0000    0.8333

The calculations extend to $E[g(X, Y) | X = t_i]$. Instead of values of $u_j$ we use values of $g(t_i, u_j)$ in
the calculations. Suppose $Z = g(X, Y) = Y^2 - 2XY$.

G = u.^2 - 2*t.*u;                    % Z = g(X,Y) = Y^2 - 2XY
EZX = sum(G.*P)./sum(P);              % E[Z|X=x]
disp([X;EZX]')
         0    1.5000
    1.0000    1.5000
    4.0000   -4.0500
    9.0000  -12.8333
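The column-by-column pattern can also be written out directly, without jcalc. The sketch below is in Python (standard library only), an illustration rather than the text's m-program; the helper name `cond_expectation` is hypothetical, and the rows of `P` are listed as on the plane, with the top row for $Y = 2$.

```python
X = [0, 1, 4, 9]
Y = [2, 0, -1]                        # row order of P: top row is Y = 2
P = [[0.05, 0.04, 0.21, 0.15],
     [0.05, 0.01, 0.09, 0.10],
     [0.10, 0.05, 0.10, 0.05]]

def cond_expectation(g):
    """Return [E[g(X, Y) | X = t_i]] for each t_i, using the column of P above t_i."""
    result = []
    for i, t in enumerate(X):
        px = sum(P[j][i] for j in range(len(Y)))                    # P(X = t_i), column sum
        num = sum(g(t, u) * P[j][i] for j, u in enumerate(Y))       # sum of g(t_i, u_j) P(X = t_i, Y = u_j)
        result.append(num / px)
    return result

EYX = cond_expectation(lambda t, u: u)                    # E[Y | X = t_i]
EZX = cond_expectation(lambda t, u: u**2 - 2 * t * u)     # E[Y^2 - 2XY | X = t_i]
print([round(v, 4) for v in EYX])
print([round(v, 4) for v in EZX])
```

Passing a different function `g` gives the conditional expectation of any other function of the pair with no change to the column arithmetic.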



14.1.3 Conditioning by a random vector: absolutely continuous case

Suppose the pair $\{X, Y\}$ has joint density function $f_{XY}$. We seek to use the concept of a conditional
distribution, given $X = t$. The fact that $P(X = t) = 0$ for each $t$ requires a modification of the approach
adopted in the discrete case. Intuitively, we consider the conditional density

$$f_{Y|X}(u|t) = \begin{cases} f_{XY}(t, u)/f_X(t) & \text{for } f_X(t) > 0 \\ 0 & \text{elsewhere} \end{cases} \qquad (14.14)$$

The condition $f_X(t) > 0$ effectively determines the range of $X$. The function $f_{Y|X}(\cdot|t)$ has the properties
of a density for each fixed $t$ for which $f_X(t) > 0$.

$$f_{Y|X}(u|t) \ge 0, \qquad \int f_{Y|X}(u|t)\,du = \frac{1}{f_X(t)}\int f_{XY}(t, u)\,du = f_X(t)/f_X(t) = 1 \qquad (14.15)$$
We define, in this case,

$$E[g(Y) | X = t] = \int g(u)\,f_{Y|X}(u|t)\,du = e(t) \qquad (14.16)$$

The function $e(\cdot)$ is defined for $f_X(t) > 0$, hence effectively on the range of $X$. For any reasonable set $M$
on the real line,

$$E[I_M(X)\,g(Y)] = \iint I_M(t)\,g(u)\,f_{XY}(t, u)\,du\,dt = \int I_M(t)\left[\int g(u)\,f_{Y|X}(u|t)\,du\right] f_X(t)\,dt \qquad (14.17)$$

$$= \int I_M(t)\,e(t)\,f_X(t)\,dt, \quad \text{where } e(t) = E[g(Y) | X = t] \qquad (14.18)$$

Thus we have, as in the discrete case, for each $t$ in the range of $X$,

$$\text{(A)} \quad E[I_M(X)\,g(Y)] = E[I_M(X)\,e(X)], \quad \text{where } e(t) = E[g(Y) | X = t] \qquad (14.19)$$

Again, we postpone examination of this pattern until we consider a more general case.
Example 14.3: Basic calculation and interpretation 

Suppose the pair $\{X, Y\}$ has joint density $f_{XY}(t, u) = \frac{6}{5}(t + 2u)$ on the triangular region bounded
by $t = 0$, $u = 1$, and $u = t$ (see Figure 14.1). Then

$$f_X(t) = \frac{6}{5}\int_t^1 (t + 2u)\,du = \frac{6}{5}\left(1 + t - 2t^2\right), \quad 0 \le t \le 1 \qquad (14.20)$$

By definition, then,

$$f_{Y|X}(u|t) = \frac{t + 2u}{1 + t - 2t^2} \quad \text{on the triangle (zero elsewhere)} \qquad (14.21)$$

We thus have

$$E[Y | X = t] = \int u\,f_{Y|X}(u|t)\,du = \frac{1}{1 + t - 2t^2}\int_t^1 \left(tu + 2u^2\right)du = \frac{4 + 3t - 7t^3}{6\left(1 + t - 2t^2\right)}, \quad 0 \le t < 1 \qquad (14.22)$$

Theoretically, we must rule out $t = 1$ since the denominator is zero for that value of $t$. This causes
no problem in practice.
Figure 14.1: The density function for Example 14.3 (Basic calculation and interpretation)



We are able to make an interpretation quite analogous to that for the discrete case. This also points the 
way to practical MATLAB calculations. 

• For any $t$ in the range of $X$ (between 0 and 1 in this case), consider a narrow vertical strip of width
$\Delta t$ with the vertical line through $t$ at its center. If the strip is narrow enough, then $f_{XY}(t, u)$ does
not vary appreciably with $t$ for any $u$.

• The mass in the strip is approximately

$$\text{Mass} \approx \Delta t \int f_{XY}(t, u)\,du = \Delta t\,f_X(t) \qquad (14.23)$$

• The moment of the mass in the strip about the line $u = 0$ is approximately

$$\text{Moment} \approx \Delta t \int u\,f_{XY}(t, u)\,du \qquad (14.24)$$

• The center of mass in the strip is

$$\text{Center of mass} = \frac{\text{Moment}}{\text{Mass}} \approx \frac{\Delta t \int u\,f_{XY}(t, u)\,du}{\Delta t\,f_X(t)} = \int u\,f_{Y|X}(u|t)\,du = e(t) \qquad (14.25)$$



This interpretation points the way to the use of MATLAB in approximating the conditional expectation.
The success of the discrete approach in approximating the theoretical value in turn supports the validity
of the interpretation. Also, this points to the general result on regression in the section, "The Regression
Problem" (Section 14.1.5: The regression problem).

In the MATLAB handling of joint absolutely continuous random variables, we divide the region into 
narrow vertical strips. Then we deal with each of these by dividing the vertical strips to form the grid 
structure. The center of mass of the discrete distribution over one of the t chosen for the approximation 
must lie close to the actual center of mass of the probability in the strip. Consider the MATLAB treatment 
of the example under consideration. 






f = '(6/5)*(t + 2*u).*(u>=t)';                     % Density as string variable
tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  eval(f)        % Evaluation of string variable
Use array operations on X, Y, PX, PY, t, u, and P
EYx = sum(u.*P)./sum(P);                           % Approximate values
eYx = (4 + 3*X - 7*X.^3)./(6*(1 + X - 2*X.^2));    % Theoretical expression
plot(X,EYx,X,eYx)
% Plotting details (see Figure 14.2)

— □ 



[Plot: "Theoretical and Approximate Conditional Expectation", E[Y|X = t] versus t for fXY(t,u) = (6/5)(t + 2u), 0 <= t <= u <= 1, with curves labeled Approximate and Theoretical.]


Figure 14.2: Theoretical and approximate conditional expectation for above (p. 424). 



The agreement of the theoretical and approximate values is quite good enough for practical purposes. It 
also indicates that the interpretation is reasonable, since the approximation determines the center of mass 
of the discretized mass which approximates the center of the actual mass in each vertical strip. 
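
The strip interpretation can be tried without the tuappr m-procedure. The following is only a minimal sketch under stated assumptions: the midpoint grid, the variable names, and the choice of 200 points are ours, and the normalization is a plain stand-in for what tuappr does internally, not a copy of it.

```matlab
% Sketch (not the tuappr procedure): center of mass of each vertical strip
% for fXY(t,u) = (6/5)(t + 2u) on 0 <= t <= u <= 1.
n  = 200;                                % assumed number of grid points per axis
X  = ((1:n) - 0.5)/n;                    % midpoints of the vertical strips
Y  = ((1:n) - 0.5)/n;                    % midpoints of the cells in each strip
[t, u] = meshgrid(X, Y);                 % t varies along columns, u along rows
P  = (6/5)*(t + 2*u).*(u >= t);          % density values on the grid
P  = P/sum(P(:));                        % normalize to a discrete joint distribution
EYx = sum(u.*P)./sum(P);                 % moment/mass for each strip (column)
eYx = (4 + 3*X - 7*X.^3)./(6*(1 + X - 2*X.^2));   % theoretical e(t), equation (14.22)
maxerr = max(abs(EYx(1:150) - eYx(1:150)));       % largest gap for t < 0.75
```

Each column of P plays the role of one strip, so sum(u.*P)./sum(P) is exactly the ratio Moment/Mass of (14.25) applied to the discretized mass. The comparison is restricted to t < 0.75 because strips near t = 1 contain very few grid cells.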



14.1.4 Extension to the general case 

Most examples for which we make numerical calculations will be one of the types above. Analysis of 
these cases is built upon the intuitive notion of conditional distributions. However, these cases and this 
interpretation are rather limited and do not provide the basis for the range of applications, theoretical and
practical, which characterize modern probability theory. We seek a basis for extension (which includes the
special cases). In each case examined above, we have the property 

(A) $E[I_M(X) g(Y)] = E[I_M(X) e(X)]$ where $e(t) = E[g(Y)|X = t]$ (14.26)

for all t in the range of X.

We have a tie to the simple case of conditioning with respect to an event. If $C = \{X \in M\}$ has positive
probability, then using $I_C = I_M(X)$ we have

(B) $E[I_M(X) g(Y)] = E[g(Y)|X \in M]\, P(X \in M)$ (14.27)

Two properties of expectation are crucial here: 

1. By the uniqueness property (E5) (p. 600), since (A) holds for all reasonable (Borel) sets, then
e(X) is unique a.s. (i.e., except for a set of $\omega$ of probability zero).

2. By the special case of the Radon-Nikodym theorem (E19) (p. 600), the function e(·) always
exists and is such that random variable e(X) is unique a.s.

We make a definition based on these facts. 

Definition. The conditional expectation $E[g(Y)|X = t] = e(t)$ is the a.s. unique function defined on
the range of X such that

(A) $E[I_M(X) g(Y)] = E[I_M(X) e(X)]$ for all Borel sets M (14.28)

Note that e(X) is a random variable and e(·) is a function. Expectation E[g(Y)] is always a constant.
The concept is abstract. At this point it has little apparent significance, except that it must include the 
two special cases studied in the previous sections. Also, it is not clear why the term conditional expectation 
should be used. The justification rests in certain formal properties which are based on the defining condition 
(A) and other properties of expectation. 

In Appendix F we tabulate a number of key properties of conditional expectation. The condition (A) 
is called property (CE1) (p. 426). We examine several of these properties. For a detailed treatment and 
proofs, any of a number of books on measure-theoretic probability may be consulted. 
(CE1) Defining condition. $e(X) = E[g(Y)|X]$ a.s. iff

$E[I_M(X) g(Y)] = E[I_M(X) e(X)]$ for each Borel set M on the codomain of X (14.29)

Note that X and Y do not need to be real valued, although g(Y) is real valued. This extension to possible
vector valued X and Y is extremely important. The next condition is just the property (B) noted above.

(CE1a) If $P(X \in M) > 0$, then $E[I_M(X) e(X)] = E[g(Y)|X \in M]\, P(X \in M)$

The special case which is obtained by setting M to include the entire range of X so that $I_M(X(\omega)) = 1$
for all $\omega$ is useful in many theoretical and applied problems.

(CE1b) Law of total probability. $E[g(Y)] = E\{E[g(Y)|X]\}$

It may seem strange that we should complicate the problem of determining E [g (Y)] by first getting the 
conditional expectation e (X) = E[g (Y) \X] then taking expectation of that function. Frequently, the data 
supplied in a problem makes this the expedient procedure. 

Example 14.4: Use of the law of total probability 

Suppose the time to failure of a device is a random quantity X ~ exponential (u), where the
parameter u is the value of a parameter random variable H. Thus

$f_{X|H}(t|u) = u e^{-ut}$ for $t \ge 0$ (14.30)

If the parameter random variable H ~ uniform (a, b), determine the expected life E[X] of the
device.

SOLUTION
We use the law of total probability:

$E[X] = E\{E[X|H]\} = \int E[X|H = u]\, f_H(u)\,du$ (14.31)

Now by assumption

$E[X|H = u] = 1/u$ and $f_H(u) = \dfrac{1}{b - a}, \quad a < u < b$ (14.32)

Thus

$E[X] = \dfrac{1}{b - a}\int_a^b \dfrac{1}{u}\,du = \dfrac{\ln(b/a)}{b - a}$ (14.33)

For a = 1/100, b = 2/100, $E[X] = 100 \ln(2) \approx 69.31$.
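
As a quick numerical check of (14.33), the integral in the law of total probability can be evaluated by the trapezoidal rule. This is a sketch only; the grid size and the variable names are our own choices and do not come from the text.

```matlab
% Check of E[X] = ln(b/a)/(b - a) for H ~ uniform(a,b), X exponential(u) given H = u.
a = 1/100;  b = 2/100;                    % endpoints used in the example
EX_formula = log(b/a)/(b - a);            % closed form (14.33)
u = linspace(a, b, 2001);                 % assumed integration grid for the parameter
EX_numeric = trapz(u, (1./u)/(b - a));    % integral of E[X|H = u] fH(u) du
fprintf('formula %.4f   numeric %.4f\n', EX_formula, EX_numeric);
```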

The next three properties, linearity, positivity/monotonicity, and monotone convergence, along with the 
defining condition provide the "expectation like" character. These properties for expectation yield most 
of the other essential properties for expectation. A similar development holds for conditional expectation, 
with some reservation for the fact that e (X) is a random variable, unique a.s. This restriction causes little 
problem for applications at the level of this treatment. 

In order to get some sense of how these properties root in basic properties of expectation, we examine 
one of them. 

(CE2) Linearity. For any constants a, b

$E[a g(Y) + b h(Z)|X] = a E[g(Y)|X] + b E[h(Z)|X]$ a.s. (14.34)

VERIFICATION

Let $e_1(X) = E[g(Y)|X]$, $e_2(X) = E[h(Z)|X]$, and $e(X) = E[a g(Y) + b h(Z)|X]$ a.s.

$E[I_M(X) e(X)] = E\{I_M(X)[a g(Y) + b h(Z)]\}$ a.s. by (CE1)

$= a E[I_M(X) g(Y)] + b E[I_M(X) h(Z)]$ a.s. by linearity of expectation

$= a E[I_M(X) e_1(X)] + b E[I_M(X) e_2(X)]$ a.s. by (CE1)

$= E\{I_M(X)[a e_1(X) + b e_2(X)]\}$ a.s. by linearity of expectation

Since the equalities hold for any Borel M, the uniqueness property (E5) (p. 600) for expectation
implies

$e(X) = a e_1(X) + b e_2(X)$ a.s. (14.35)

This is property (CE2) (p. 427). An extension to any finite linear combination is easily established by 
mathematical induction. 
— □ 

Property (CE5) (p. 427) provides another condition for independence.
(CE5) Independence. {X, Y} is an independent pair

iff $E[g(Y)|X] = E[g(Y)]$ a.s. for all Borel functions g

iff $E[I_N(Y)|X] = E[I_N(Y)]$ a.s. for all Borel sets N on the codomain of Y

Since knowledge of X does not affect the likelihood that Y will take on any set of values, then conditional
expectation should not be affected by the value of X. The resulting constant value of the conditional
expectation must be E[g(Y)] in order for the law of total probability to hold. A formal proof utilizes uniqueness
(E5) (p. 600) and the product rule (E18) (p. 600) for expectation.

Property (CE6) (p. 427) forms the basis for the solution of the regression problem in the next section.

(CE6) $e(X) = E[g(Y)|X]$ a.s. iff $E[h(X) g(Y)] = E[h(X) e(X)]$ a.s. for any Borel function h
Examination shows this to be the result of replacing $I_M(X)$ in (CE1) (p. 426) with arbitrary h(X).
Again, to get some insight into how the various properties arise, we sketch the ideas of a proof of (CE6) (p. 
427). 

IDEAS OF A PROOF OF (CE6) (p. 427) 

1. For $h(X) = I_M(X)$, this is (CE1) (p. 426).

2. For $h(X) = \sum_{i=1}^{n} a_i I_{M_i}(X)$, the result follows by linearity.

3. For $h \ge 0$, $g \ge 0$, there is a sequence of nonnegative, simple $h_n \nearrow h$. Now by positivity, $e(X) \ge 0$.
By monotone convergence (CE4),

$E[h_n(X) g(Y)] \nearrow E[h(X) g(Y)]$ and $E[h_n(X) e(X)] \nearrow E[h(X) e(X)]$ (14.36)

Since corresponding terms in the sequences are equal, the limits are equal.

4. For $h = h^+ - h^-$, $g \ge 0$, the result follows by linearity (CE2) (p. 427).

5. For $g = g^+ - g^-$, the result again follows by linearity.

— □ 

Properties (CE8) (p. 428) and (CE9) (p. 428) are peculiar to conditional expectation. They play an 
essential role in many theoretical developments. They are essential in the study of Markov sequences and 
of a class of random sequences known as submartingales. We list them here (as well as in Appendix F) for 
reference. 

(CE8) $E[h(X) g(Y)|X] = h(X) E[g(Y)|X]$ a.s. for any Borel function h

This property says that any function of the conditioning random vector may be treated as a constant
factor. This combined with (CE10) (p. 428) below provides useful aids to computation.

(CE9) Repeated conditioning

If $X = h(W)$, then $E\{E[g(Y)|X]|W\} = E\{E[g(Y)|W]|X\} = E[g(Y)|X]$ a.s. (14.37)

This somewhat formal property is highly useful in many theoretical developments. We provide an
interpretation after the development of regression theory in the next section.

The next property is highly intuitive and very useful. It is easy to establish in the two elementary cases
developed in previous sections. Its proof in the general case is quite sophisticated.
(CE10) Under conditions on g that are nearly always met in practice

a. $E[g(X, Y)|X = t] = E[g(t, Y)|X = t]$ a.s. $[P_X]$

b. If {X, Y} is independent, then $E[g(X, Y)|X = t] = E[g(t, Y)]$ a.s. $[P_X]$

It certainly seems reasonable to suppose that if X = t, then we should be able to replace X by t in
$E[g(X, Y)|X = t]$ to get $E[g(t, Y)|X = t]$. Property (CE10) (p. 428) assures this. If {X, Y} is an independent
pair, then the value of X should not affect the value of Y, so that $E[g(t, Y)|X = t] = E[g(t, Y)]$ a.s.

Example 14.5: Use of property (CE10) (p. 428) 

Consider again the distribution for Example 14.3 (Basic calculation and interpretation). The pair
{X, Y} has density

$f_{XY}(t, u) = \frac{6}{5}(t + 2u)$ on the triangular region bounded by t = 0, u = 1, and u = t (14.38)

We show in Example 14.3 (Basic calculation and interpretation) that

$E[Y|X = t] = \dfrac{4 + 3t - 7t^3}{6(1 + t - 2t^2)}, \quad 0 \le t < 1$ (14.39)

Let $Z = 3X^2 + 2XY$. Determine $E[Z|X = t]$.
SOLUTION
By linearity, (CE8) (p. 428), and (CE10) (p. 428)

$E[Z|X = t] = 3t^2 + 2t\,E[Y|X = t] = 3t^2 + \dfrac{4t + 3t^2 - 7t^4}{3(1 + t - 2t^2)}$ (14.40)
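
The result of Example 14.5 can be compared with a discrete approximation in which Z = g(X, Y) is evaluated on the grid and averaged strip by strip. This is a hedged sketch: the midpoint grid, the 200 points, and the variable names are assumptions of ours, and the code does not use the tuappr m-procedure.

```matlab
% Sketch: approximate E[Z|X = t] for Z = 3X^2 + 2XY, fXY(t,u) = (6/5)(t + 2u), 0 <= t <= u <= 1.
n = 200;                                  % assumed grid size
X = ((1:n) - 0.5)/n;   Y = X;             % strip and cell midpoints
[t, u] = meshgrid(X, Y);
P = (6/5)*(t + 2*u).*(u >= t);
P = P/sum(P(:));                          % discrete joint distribution
G = 3*t.^2 + 2*t.*u;                      % values of Z = g(X,Y) on the grid
EZx = sum(G.*P)./sum(P);                  % approximate E[Z|X = t], one value per strip
eZx = 3*X.^2 + (4*X + 3*X.^2 - 7*X.^4)./(3*(1 + X - 2*X.^2));   % expression (14.40)
maxerr = max(abs(EZx(1:150) - eZx(1:150)));                     % compare for t < 0.75
```

Within a strip the factor t is essentially constant, which is the numerical counterpart of replacing X by t under (CE10) and pulling functions of X outside by (CE8).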

Conditional probability 

In the treatment of mathematical expectation, we note that probability may be expressed as an
expectation

$P(E) = E[I_E]$ (14.41)

For conditional probability, given an event, we have

$E[I_E|C] = \dfrac{E[I_E I_C]}{P(C)} = \dfrac{P(EC)}{P(C)} = P(E|C)$ (14.42)

We extend the concept in the same manner, using conditional expectation.
Definition. The conditional probability of event E, given X, is

$P(E|X) = E[I_E|X]$ (14.43)

Thus, there is no need for a separate theory of conditional probability. We may define the conditional
distribution function

$F_{Y|X}(u|X) = P(Y \le u|X) = E[I_{(-\infty, u]}(Y)|X]$ (14.44)

Then, by the law of total probability (CE1b) (p. 426),

$F_Y(u) = E[F_{Y|X}(u|X)] = \int F_{Y|X}(u|t)\, F_X(dt)$ (14.45)

If there is a conditional density $f_{Y|X}$ such that

$P(Y \in M|X = t) = \int_M f_{Y|X}(r|t)\,dr$ (14.46)

then

$F_{Y|X}(u|t) = \int_{-\infty}^{u} f_{Y|X}(r|t)\,dr$ so that $f_{Y|X}(u|t) = \dfrac{\partial}{\partial u} F_{Y|X}(u|t)$ (14.47)

A careful, measure-theoretic treatment shows that it may not be true that $F_{Y|X}(\cdot|t)$ is a distribution function
for all t in the range of X. However, in applications, this is seldom a problem. Modeling assumptions often
start with such a family of distribution functions or density functions.

Example 14.6: The conditional distribution function 

As in Example 14.4 (Use of the law of total probability), suppose X ~ exponential (u), where the
parameter u is the value of a parameter random variable H. If the parameter random variable H ~
uniform (a, b), determine the distribution function $F_X$.

SOLUTION

As in Example 14.4 (Use of the law of total probability), take the assumption on the conditional
distribution to mean

$f_{X|H}(t|u) = u e^{-ut}, \quad t \ge 0$ (14.48)

Then

$F_{X|H}(t|u) = \int_0^t u e^{-us}\,ds = 1 - e^{-ut}, \quad 0 \le t$ (14.49)

By the law of total probability

$F_X(t) = \int F_{X|H}(t|u) f_H(u)\,du = \dfrac{1}{b - a}\int_a^b \left(1 - e^{-ut}\right)du = 1 - \dfrac{1}{b - a}\int_a^b e^{-ut}\,du$ (14.50)

$= 1 - \dfrac{1}{t(b - a)}\left[e^{-at} - e^{-bt}\right]$ (14.51)

Differentiation with respect to t yields the expression for $f_X(t)$

$f_X(t) = \dfrac{1}{b - a}\left[\left(\dfrac{a}{t} + \dfrac{1}{t^2}\right)e^{-at} - \left(\dfrac{b}{t} + \dfrac{1}{t^2}\right)e^{-bt}\right], \quad t > 0$ (14.52)
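
The closed forms (14.51) and (14.52) can be checked against direct numerical mixing over the parameter distribution. In this sketch the values a = 1/100, b = 2/100 are borrowed from Example 14.4, while the test point t = 50, the grid, and the variable names are arbitrary choices of ours.

```matlab
% Check of FX and fX for X exponential(u) given H = u, H ~ uniform(a,b).
a = 1/100;  b = 2/100;                          % assumed parameter range
FX = @(t) 1 - (exp(-a*t) - exp(-b*t))./(t*(b - a));                               % (14.51)
fX = @(t) ((a./t + 1./t.^2).*exp(-a*t) - (b./t + 1./t.^2).*exp(-b*t))/(b - a);   % (14.52)
t0 = 50;                                        % arbitrary test point, t0 > 0
u  = linspace(a, b, 2001);                      % grid for the parameter value
FX_mix = trapz(u, (1 - exp(-u*t0))/(b - a));    % integral of F(t0|u) fH(u) du
fX_mix = trapz(u, (u.*exp(-u*t0))/(b - a));     % integral of f(t0|u) fH(u) du
fprintf('FX: %.6f vs %.6f   fX: %.6f vs %.6f\n', FX(t0), FX_mix, fX(t0), fX_mix);
```

The second comparison uses the fact that the marginal density is the mixture of the conditional densities, which gives an independent check on the result of differentiating (14.51).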



The following example uses a discrete conditional distribution and marginal distribution to obtain the joint 
distribution for the pair. 

Example 14.7: A random number N of Bernoulli trials

A number N is chosen by a random selection from the integers from 1 through 20 (say by drawing
a card from a box). A pair of dice is thrown N times. Let S be the number of "matches" (i.e., both
ones, both twos, etc.). Determine the joint distribution for {N, S}.

SOLUTION

N ~ uniform on the integers 1 through 20. P(N = i) = 1/20 for $1 \le i \le 20$. Since there are
36 pairs of numbers for the two dice and six possible matches, the probability of a match on any
throw is 1/6. Since the i throws of the dice constitute a Bernoulli sequence with probability 1/6
of a success (a match), we have S conditionally binomial (i, 1/6), given N = i. For any pair (i, j),
$0 \le j \le i$,

$P(N = i, S = j) = P(S = j|N = i)\,P(N = i)$ (14.53)

Now E[S|N = i] = i/6, so that

$E[S] = \dfrac{1}{6}\cdot\dfrac{1}{20}\sum_{i=1}^{20} i = \dfrac{20 \cdot 21}{6 \cdot 20 \cdot 2} = \dfrac{7}{4} = 1.75$ (14.54)

The following MATLAB procedure calculates the joint probabilities and arranges them "as on the 
plane." 

% file randbern.m
p  = input('Enter the probability of success  ');
N  = input('Enter VALUES of N  ');
PN = input('Enter PROBABILITIES for N  ');
n  = length(N);
m  = max(N);
S  = 0:m;
P  = zeros(n,m+1);
for i = 1:n
  P(i,1:N(i)+1) = PN(i)*ibinom(N(i),p,0:N(i));
end
PS = sum(P);
P  = rot90(P);
disp('Joint distribution N, S, P, and marginal PS')

randbern                                  % Call for the procedure
Enter the probability of success  1/6
Enter VALUES of N  1:20
Enter PROBABILITIES for N  0.05*ones(1,20)
Joint distribution N, S, P, and marginal PS
ES = S*PS'
ES = 1.7500                               % Agrees with the theoretical value
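
For readers without the ibinom m-function, the same joint distribution can be built from the binomial formula directly. The following is a minimal non-interactive sketch, not the randbern procedure itself: it uses the built-in nchoosek in place of ibinom, hard-codes the data of the example, and omits the rot90 step, so rows of P correspond to values of N and columns to values of S.

```matlab
% Sketch: joint distribution of {N, S} from P(N = i, S = j) = P(S = j|N = i) P(N = i).
p  = 1/6;                                  % probability of a match on one throw
N  = 1:20;                                 % values of N
PN = 0.05*ones(1,20);                      % P(N = i) = 1/20
m  = max(N);
S  = 0:m;                                  % possible values of S
P  = zeros(length(N), m+1);                % P(i,j+1) = P(N = N(i), S = j)
for i = 1:length(N)
  for j = 0:N(i)
    P(i,j+1) = PN(i)*nchoosek(N(i),j)*p^j*(1 - p)^(N(i) - j);
  end
end
PS = sum(P);                               % marginal distribution for S
ES = S*PS';                                % E[S]
```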



14.1.5 The regression problem 

We introduce the regression problem in the treatment of linear regression. Here we are concerned with
more general regression. A pair {X, Y} of real random variables has a joint distribution. A value $X(\omega)$
is observed. We desire a rule for obtaining the "best" estimate of the corresponding value $Y(\omega)$. If $Y(\omega)$
is the actual value and $r(X(\omega))$ is the estimate, then $Y(\omega) - r(X(\omega))$ is the error of estimate. The best
estimation rule (function) r(·) is taken to be that for which the average square of the error is a minimum.
That is, we seek a function r such that

$E[(Y - r(X))^2]$ is a minimum. (14.55)

In the treatment of linear regression, we determine the best affine function, u = at + b. The optimum
function of this form defines the regression line of Y on X. We now turn to the problem of finding the best
function r, which may in some cases be an affine function, but more often is not.

We have some hints of possibilities. In the treatment of expectation, we find that the best constant to
approximate a random variable in the mean square sense is the mean value, which is the center of mass for
the distribution. In the interpretive Example 14.2.1 for the discrete case, we find the conditional expectation
$E[Y|X = t_i]$ is the center of mass for the conditional distribution at $X = t_i$. A similar result, considering thin
vertical strips, is found in Example 14.2 (Basic calculations and interpretation) for the absolutely continuous
case. This suggests the possibility that e(t) = E[Y|X = t] might be the best estimate for Y when the value
$X(\omega) = t$ is observed. We investigate this possibility. The property (CE6) (p. 427) proves to be key to
obtaining the result.

Let e(X) = E[Y|X]. We may write (CE6) (p. 427) in the form E[h(X)(Y - e(X))] = 0 for any
reasonable function h. Consider

$E[(Y - r(X))^2] = E[(Y - e(X) + e(X) - r(X))^2]$ (14.56)

$= E[(Y - e(X))^2] + E[(e(X) - r(X))^2] + 2E[(Y - e(X))(e(X) - r(X))]$ (14.57)

Now e(X) is fixed (a.s.) and for any choice of r we may take h(X) = e(X) - r(X) to assert that

$E[(Y - e(X))(e(X) - r(X))] = E[(Y - e(X)) h(X)] = 0$ (14.58)

Thus

$E[(Y - r(X))^2] = E[(Y - e(X))^2] + E[(e(X) - r(X))^2]$ (14.59)

The first term on the right hand side is fixed; the second term is nonnegative, with a minimum at zero iff
r(X) = e(X) a.s. Thus, r = e is the best rule. For a given value $X(\omega) = t$ the best mean square estimate
of Y is

$u = e(t) = E[Y|X = t]$ (14.60)

The graph of u = e(t) vs t is known as the regression curve of Y on X. This is defined for argument t in
the range of X, and is unique except possibly on a set N such that $P(X \in N) = 0$. Determination of the
regression curve is thus determination of the conditional expectation.
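
The minimizing property can be seen numerically on a small discrete distribution. The joint probabilities below are made up purely for illustration (they are not from the text), as are the variable names. The sketch compares the mean square error of three rules: the conditional expectation, the regression line, and the best constant.

```matlab
% Hypothetical joint distribution: P(i,j) = P(Y = Y(i), X = X(j)); numbers are illustrative only.
X = [1 2 3];
Y = [0 1 4];
P = [0.10 0.05 0.20
     0.15 0.10 0.05
     0.05 0.25 0.05];
PX  = sum(P);                               % marginal distribution for X
e   = (Y*P)./PX;                            % e(t_j) = E[Y|X = t_j]
EX  = X*PX';   EY = sum(Y*P);               % E[X] and E[Y]
EXY = X*(Y*P)';                             % E[XY]
a   = (EXY - EX*EY)/((X.^2)*PX' - EX^2);    % slope of the regression line
b   = EY - a*EX;                            % intercept of the regression line
% Mean square error of a rule r, given by its values r(j) at the points X(j)
mse = @(r) sum(sum(P.*(repmat(Y',1,3) - repmat(r,3,1)).^2));
mse_e     = mse(e);                         % rule r = e (conditional expectation)
mse_line  = mse(a*X + b);                   % best affine rule
mse_const = mse(EY*ones(1,3));              % best constant rule
```

Because constants are a special case of affine functions, and affine functions a special case of general rules r, the three errors should be ordered mse_e <= mse_line <= mse_const, in agreement with (14.59).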
Example 14.8: Regression curve for an independent pair

If the pair {X, Y} is independent, then u = E[Y|X = t] = E[Y], so that the regression curve of
Y on X is the horizontal line through u = E[Y]. This, of course, agrees with the regression line,
since Cov[X, Y] = 0 and the regression line is u = 0 + E[Y].

The result extends to functions of X and Y. Suppose Z = g(X, Y). Then the pair {X, Z} has a joint
distribution, and the best mean square estimate of Z given X = t is E[Z|X = t].

Example 14.9: Estimate of a function of {X, Y} 

Suppose the pair {X, Y} has joint density $f_{XY}(t, u) = 60t^2 u$ for $0 \le t \le 1$, $0 \le u \le 1 - t$. This is
the triangular region bounded by t = 0, u = 0, and u = 1 - t (see Figure 14.3). Integration shows
that

$f_X(t) = 30t^2(1 - t)^2, \quad 0 \le t \le 1$ and $f_{Y|X}(u|t) = \dfrac{2u}{(1 - t)^2}$ on the triangle (14.61)

Consider

$Z = \begin{cases} X^2 & \text{for } X \le 1/2 \\ 2Y & \text{for } X > 1/2 \end{cases} = I_M(X)\, X^2 + I_N(X)\, 2Y$ (14.62)

where M = [0, 1/2] and N = (1/2, 1]. Determine E[Z|X = t].





Figure 14.3: The density function for Example 14.9 (Estimate of a function of {X, Y}). 



SOLUTION By linearity and (CE8) (p. 428),

$E[Z|X = t] = E[I_M(X) X^2|X = t] + E[I_N(X)\, 2Y|X = t] = I_M(t)\, t^2 + I_N(t)\, 2E[Y|X = t]$ (14.63)

Now

$E[Y|X = t] = \int u f_{Y|X}(u|t)\,du = \dfrac{1}{(1 - t)^2}\int_0^{1-t} 2u^2\,du = \dfrac{2}{3}\cdot\dfrac{(1 - t)^3}{(1 - t)^2} = \dfrac{2}{3}(1 - t), \quad 0 \le t < 1$ (14.64)

so that

$E[Z|X = t] = I_M(t)\, t^2 + I_N(t)\, \dfrac{4}{3}(1 - t)$ (14.65)

Note that the indicator functions separate the two expressions. The first holds on the interval
M = [0, 1/2] and the second holds on the interval N = (1/2, 1]. The two expressions $t^2$ and
(4/3)(1 - t) must not be added, for this would give an expression incorrect for all t in the range of
X.

APPROXIMATION 

tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  100
Enter number of Y approximation points  100
Enter expression for joint density  60*t.^2.*u.*(u<=1-t)
Use array operations on X, Y, PX, PY, t, u, and P
G = (t<=0.5).*t.^2 + 2*(t>0.5).*u;
EZx = sum(G.*P)./sum(P);                           % Approximation
eZx = (X<=0.5).*X.^2 + (4/3)*(X>0.5).*(1-X);       % Theoretical
plot(X,EZx,'k-',X,eZx,'k-.')
% Plotting details                                 % See Figure 14.4

The fit is quite sufficient for practical purposes, in spite of the moderate number of approximation 
points. The difference in expressions for the two intervals of X values is quite clear. 



Figure 14.4: Theoretical and approximate regression curves for Example 14.9 (Estimate of a function 
of {X, Y}). 



Example 14.10: Estimate of a function of {X, Y} 

Suppose the pair {X, Y} has joint density $f_{XY}(t, u) = \frac{6}{5}(t^2 + u)$, on the unit square $0 \le t \le 1$,
$0 \le u \le 1$ (see Figure 14.5). The usual integration shows

$f_X(t) = \dfrac{3}{5}(2t^2 + 1), \quad 0 \le t \le 1$, and $f_{Y|X}(u|t) = 2\,\dfrac{t^2 + u}{2t^2 + 1}$ on the square (14.66)

Consider

$Z = \begin{cases} 2X^2 & \text{for } X \le Y \\ 3XY & \text{for } X > Y \end{cases} = I_Q(X, Y)\, 2X^2 + I_{Q^c}(X, Y)\, 3XY$, where $Q = \{(t, u) : u \ge t\}$ (14.67)

Determine E[Z|X = t].
SOLUTION

$E[Z|X = t] = 2t^2 \int I_Q(t, u) f_{Y|X}(u|t)\,du + 3t \int I_{Q^c}(t, u)\, u f_{Y|X}(u|t)\,du$ (14.68)

$= \dfrac{4t^2}{2t^2 + 1}\int_t^1 (t^2 + u)\,du + \dfrac{6t}{2t^2 + 1}\int_0^t (t^2 u + u^2)\,du = \dfrac{-t^5 + 4t^4 + 2t^2}{2t^2 + 1}, \quad 0 \le t \le 1$ (14.69)



Figure 14.5: The density and regions for Example 14.10 (Estimate of a function of {X, Y}) 



Note the different role of the indicator functions than in Example 14.9 (Estimate of a function 
of {X, Y}). There they provide a separation of two parts of the result. Here they serve to set the 
effective limits of integration, but the sum of the two parts is needed for each t.

Figure 14.6: Theoretical and approximate regression curves for Example 14.10 (Estimate of a function
of {X, Y}).



APPROXIMATION 

tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  (6/5)*(t.^2 + u)
Use array operations on X, Y, PX, PY, t, u, and P
G = 2*t.^2.*(u>=t) + 3*t.*u.*(u<t);
EZx = sum(G.*P)./sum(P);                           % Approximate
eZx = (-X.^5 + 4*X.^4 + 2*X.^2)./(2*X.^2 + 1);     % Theoretical
plot(X,EZx,'k-',X,eZx,'k-.')
% Plotting details                                 % See Figure 14.6



The theoretical and approximate curves are barely distinguishable on the plot. Although the same number
of approximation points are used as in Figure 14.4 (Example 14.9 (Estimate of a function of {X, Y})),
the fact that the entire region is included in the grid means a larger number of effective points in
this example.

Given our approach to conditional expectation, the fact that it solves the regression problem is a matter
that requires proof using properties of conditional expectation. An alternate approach is simply to define
the conditional expectation to be the solution to the regression problem, then determine its properties.
This yields, in particular, our defining condition (CE1) (p. 426). Once that is established, properties of
expectation (including the uniqueness property (E5) (p. 600)) show the essential equivalence of
the two concepts. There are some technical differences which do not affect most applications. The alternate
approach assumes the second moment $E[X^2]$ is finite. Not all random variables have this property. However,
those ordinarily used in applications at the level of this treatment will have a variance, hence a finite second
moment.

We use the interpretation of e(X) = E[g(Y)|X] as the best mean square estimator of g(Y), given X,
to interpret the formal property (CE9) (p. 437). We examine the special form

(CE9a) $E\{E[g(Y)|X]|X, Z\} = E\{E[g(Y)|X, Z]|X\} = E[g(Y)|X]$

Put $e_1(X, Z) = E[g(Y)|X, Z]$, the best mean square estimator of g(Y), given (X, Z). Then (CE9a)
can be expressed

$E[e(X)|X, Z] = e(X)$ a.s. and $E[e_1(X, Z)|X] = e(X)$ a.s. (14.70)

In words, if we take the best estimate of g(Y), given X, then take the best mean square estimate of that,
given (X, Z), we do not change the estimate of g(Y). On the other hand if we first get the best mean square
estimate of g(Y), given (X, Z), and then take the best mean square estimate of that, given X, we get the
best mean square estimate of g(Y), given X.
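
The second relation in (14.70) can be illustrated on a small discrete distribution for (X, Z, Y). The probabilities below are invented for the illustration and the array layout is our own convention; the sketch takes g(Y) = Y.

```matlab
% Hypothetical joint distribution: P(i,k,j) = P(X = x_i, Z = z_k, Y = Y(j)); numbers are made up.
Y = [0 1 2];                                    % values of Y (here g(Y) = Y)
P = zeros(2,2,3);
P(1,1,:) = [0.10 0.05 0.05];
P(1,2,:) = [0.05 0.10 0.15];
P(2,1,:) = [0.05 0.05 0.10];
P(2,2,:) = [0.15 0.10 0.05];
Yw  = repmat(reshape(Y,1,1,3), [2 2 1]);        % Y values laid out along the third dimension
PXZ = sum(P,3);                                 % joint distribution of (X,Z)
PX  = sum(PXZ,2);                               % marginal distribution of X
e1  = sum(P.*Yw,3)./PXZ;                        % e1(x,z) = E[Y|X = x, Z = z]
e   = sum(sum(P.*Yw,3),2)./PX;                  % e(x)    = E[Y|X = x]
lhs = sum(e1.*PXZ,2)./PX;                       % E[e1(X,Z)|X = x]
gap = max(abs(lhs - e));                        % zero up to rounding, per (CE9a)
```

Averaging the finer estimate e1(X, Z) over Z within each value of X recovers e(X), which is the statement that estimating in two stages gives the same result as estimating from X alone.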

14.2 Problems on Conditional Expectation, Regression 2 

For the distributions in Exercises 1-3 

a. Determine the regression curve of Y on X and compare with the regression line of Y on X. 

b. For the function Z = g (X, Y) indicated in each case, determine the regression curve of Z on X. 

Exercise 14.1 (Solution on p. 444.) 

(See Exercise 17 (Exercise 11.17) from "Problems on Mathematical Expectation"). The pair {X, Y}
has the joint distribution (in file npr08_07.m (Section 17.8.38: npr08_07)):

P(X = t, Y = u) (14.71)

t =        -3.1     -0.5      1.2      2.4      3.7      4.9
u =  7.5   0.0090   0.0396   0.0594   0.0216   0.0440   0.0203
     4.1   0.0495   0        0.1089   0.0528   0.0363   0.0231
    -2.0   0.0405   0.1320   0.0891   0.0324   0.0297   0.0189
    -3.8   0.0510   0.0484   0.0726   0.0132   0        0.0077

Table 14.2

The regression line of Y on X is u = 0.5275t + 0.6924.

$Z = X^2 Y + |X + Y|$ (14.72)




2 This content is available online at <http://cnx.org/content/m24441/1.4/>.
Exercise 14.2 (Solution on p. 444.)

(See Exercise 18 (Exercise 11.18) from "Problems on Mathematical Expectation"). The pair
{X, Y} has the joint distribution (in file npr08_08.m (Section 17.8.39: npr08_08)):

P(X = t, Y = u) (14.73)

t =        1        3        5        7        9        11       13       15       17       19
u = 12   0.0156   0.0191   0.0081   0.0035   0.0091   0.0070   0.0098   0.0056   0.0091   0.0049
    10   0.0064   0.0204   0.0108   0.0040   0.0054   0.0080   0.0112   0.0064   0.0104   0.0056
     9   0.0196   0.0256   0.0126   0.0060   0.0156   0.0120   0.0168   0.0096   0.0056   0.0084
     5   0.0112   0.0182   0.0108   0.0070   0.0182   0.0140   0.0196   0.0012   0.0182   0.0038
     3   0.0060   0.0260   0.0162   0.0050   0.0160   0.0200   0.0280   0.0060   0.0160   0.0040
    -1   0.0096   0.0056   0.0072   0.0060   0.0256   0.0120   0.0268   0.0096   0.0256   0.0084
    -3   0.0044   0.0134   0.0180   0.0140   0.0234   0.0180   0.0252   0.0244   0.0234   0.0126
    -5   0.0072   0.0017   0.0063   0.0045   0.0167   0.0090   0.0026   0.0172   0.0217   0.0223

Table 14.3

The regression line of Y on X is u = -0.2584t + 5.6110.

$Z = I_Q(X, Y)\sqrt{X}\,(Y - 4) + I_{Q^c}(X, Y)\, X Y^2, \quad Q = \{(t, u) : u \le t\}$ (14.74)



Exercise 14.3 (Solution on p. 445.) 

(See Exercise 19 (Exercise 11.19) from "Problems on Mathematical Expectation"). Data were kept
on the effect of training time on the time to perform a job on a production line. X is the amount
of training, in hours, and Y is the time to perform the task, in minutes. The data are as follows
(in file npr08_09.m (Section 17.8.40: npr08_09)):

P(X = t, Y = u) (14.75)

t =       1       1.5     2       2.5     3
u = 5   0.039   0.011   0.005   0.001   0.001
    4   0.065   0.070   0.050   0.015   0.010
    3   0.031   0.061   0.137   0.051   0.033
    2   0.012   0.049   0.163   0.058   0.039
    1   0.003   0.009   0.045   0.025   0.017

Table 14.4

The regression line of Y on X is u = -0.7793t + 4.3051.

$Z = (Y - 2.8)/X$ (14.76)



For the joint densities in Exercises 4-11 below

a. Determine analytically the regression curve of Y on X and compare with the regression line of Y on
X.

b. Check these with a discrete approximation.

Exercise 14.4 (Solution on p. 445.)

(See Exercise 10 (Exercise 8.10) from "Problems On Random Vectors and Joint Distributions",
Exercise 20 (Exercise 11.20) from "Problems on Mathematical Expectation", and Exercise 23
(Exercise 12.23) from "Problems on Variance, Covariance, Linear Regression"). $f_{XY}(t, u) = 1$ for
$0 \le t \le 1$, $0 \le u \le 2(1 - t)$.

The regression line of Y on X is u = 1 - t.

$f_X(t) = 2(1 - t), \quad 0 \le t \le 1$ (14.77)

Exercise 14.5 (Solution on p. 445.) 

(See Exercise 13 (Exercise 8.13) from "Problems On Random Vectors and Joint Distributions",
Exercise 23 (Exercise 11.23) from "Problems on Mathematical Expectation", and Exercise 24
(Exercise 12.24) from "Problems on Variance, Covariance, Linear Regression"). $f_{XY}(t, u) = \frac{1}{8}(t + u)$
for $0 \le t \le 2$, $0 \le u \le 2$.

The regression line of Y on X is u = -t/11 + 35/33.

$f_X(t) = \frac{1}{4}(t + 1), \quad 0 \le t \le 2$ (14.78)

Exercise 14.6 (Solution on p. 446.) 

(See Exercise 15 (Exercise 8.15) from "Problems On Random Vectors and Joint Distributions",
Exercise 25 (Exercise 11.25) from "Problems on Mathematical Expectation", and Exercise 25
(Exercise 12.25) from "Problems on Variance, Covariance, Linear Regression"). $f_{XY}(t, u) =
\frac{3}{88}(2t + 3u^2)$ for $0 \le t \le 2$, $0 \le u \le 1 + t$.

The regression line of Y on X is u = 0.0958t + 1.4876.

$f_X(t) = \frac{3}{88}(1 + t)(1 + 4t + t^2) = \frac{3}{88}(1 + 5t + 5t^2 + t^3), \quad 0 \le t \le 2$ (14.79)

Exercise 14.7 (Solution on p. 446.) 

(See Exercise 16 (Exercise 8.16) from "Problems On Random Vectors and Joint Distributions",
Exercise 26 (Exercise 11.26) from "Problems on Mathematical Expectation", and Exercise 26
(Exercise 12.26) from "Problems on Variance, Covariance, Linear Regression"). $f_{XY}(t, u) = 12t^2 u$ on
the parallelogram with vertices

(-1, 0), (0, 0), (1, 1), (0, 1) (14.80)

The regression line of Y on X is u = (4t + 5)/9.

$f_X(t) = I_{[-1,0]}(t)\, 6t^2(t + 1)^2 + I_{(0,1]}(t)\, 6t^2(1 - t^2)$ (14.81)

Exercise 14.8 (Solution on p. 446.) 

(See Exercise 17 (Exercise 8.17) from "Problems On Random Vectors and Joint Distributions",
Exercise 27 (Exercise 11.27) from "Problems on Mathematical Expectation", and Exercise 27
(Exercise 12.27) from "Problems on Variance, Covariance, Linear Regression"). $f_{XY}(t, u) = \frac{24}{11}tu$ for
$0 \le t \le 2$, $0 \le u \le \min\{1, 2 - t\}$.
The regression line of Y on X is u = (-124t + 368)/431.

$f_X(t) = I_{[0,1]}(t)\, \frac{12}{11}t + I_{(1,2]}(t)\, \frac{12}{11}t(2 - t)^2$ (14.82)

Exercise 14.9 (Solution on p. 447.) 

(See Exercise 18 (Exercise 8.18) from "Problems On Random Vectors and Joint Distributions",
Exercise 28 (Exercise 11.28) from "Problems on Mathematical Expectation", and Exercise 28
(Exercise 12.28) from "Problems on Variance, Covariance, Linear Regression"). $f_{XY}(t, u) = \frac{3}{23}(t + 2u)$
for $0 \le t \le 2$, $0 \le u \le \max\{2 - t, t\}$.

The regression line of Y on X is u = 1.0561t - 0.2603.

$f_X(t) = I_{[0,1]}(t)\, \frac{6}{23}(2 - t) + I_{(1,2]}(t)\, \frac{6}{23}t^2$ (14.83)

Exercise 14.10 (Solution on p. 447.) 

(See Exercise 21 (Exercise 8.21) from "Problems On Random Vectors and Joint Distributions",
Exercise 31 (Exercise 11.31) from "Problems on Mathematical Expectation", and Exercise 29
(Exercise 12.29) from "Problems on Variance, Covariance, Linear Regression"). $f_{XY}(t, u) = \frac{2}{13}(t + 2u)$,
for $0 \le t \le 2$, $0 \le u \le \min\{2t, 3 - t\}$.

The regression line of Y on X is u = -0.1359t + 1.0839.

$f_X(t) = I_{[0,1]}(t)\, \frac{12}{13}t^2 + I_{(1,2]}(t)\, \frac{6}{13}(3 - t)$ (14.84)

Exercise 14.11 (Solution on p. 447.) 

(See Exercise 22 (Exercise 8.22) from "Problems On Random Vectors and Joint Distributions",
Exercise 32 (Exercise 11.32) from "Problems on Mathematical Expectation", and Exercise 30
(Exercise 12.30) from "Problems on Variance, Covariance, Linear Regression"). $f_{XY}(t, u) =
I_{[0,1]}(t)\, \frac{3}{8}(t^2 + 2u) + I_{(1,2]}(t)\, \frac{9}{14}t^2 u^2$,
for $0 \le u \le 1$.
The regression line of Y on X is u = 0.0817t + 0.5989.

$f_X(t) = I_{[0,1]}(t)\, \frac{3}{8}(t^2 + 1) + I_{(1,2]}(t)\, \frac{3}{14}t^2$ (14.85)

For the distributions in Exercises 12-16 below 

a. Determine analytically E [Z\X = t] 

b. Use a discrete approximation to calculate the same functions. 

Exercise 14.12 (Solution on p. 448.) 

$f_{XY}(t,u) = \frac{3}{88}(2t + 3u^2)$ for $0 \le t \le 2$, $0 \le u \le 1 + t$ (see Exercise 37 (Exercise 11.37) from
"Problems on Mathematical Expectation", and Exercise 14.6).

$f_X(t) = \frac{3}{88}(1 + t)(1 + 4t + t^2) = \frac{3}{88}(1 + 5t + 5t^2 + t^3)$, $0 \le t \le 2$ (14.86)

$Z = I_{[0,1]}(X)\,4X + I_{(1,2]}(X)\,(X + Y)$ (14.87)

Exercise 14.13 (Solution on p. 448.)

$f_{XY}(t,u) = \frac{24}{11}tu$ for $0 \le t \le 2$, $0 \le u \le \min\{1, 2 - t\}$ (see Exercise 38 (Exercise 11.38) from
"Problems on Mathematical Expectation", and Exercise 14.8).

$f_X(t) = I_{[0,1]}(t)\,\frac{12}{11}t + I_{(1,2]}(t)\,\frac{12}{11}t(2 - t)^2$ (14.88)

$Z = I_M(X,Y)\,\frac{1}{2}X + I_{M^c}(X,Y)\,Y^2$, $M = \{(t,u) : u > t\}$ (14.89)

Exercise 14.14 (Solution on p. 448.) 

$f_{XY}(t,u) = \frac{3}{23}(t + 2u)$ for $0 \le t \le 2$, $0 \le u \le \max\{2 - t, t\}$ (see Exercise 39 (Exercise 11.39), and
Exercise 14.9).

$f_X(t) = I_{[0,1]}(t)\,\frac{6}{23}(2 - t) + I_{(1,2]}(t)\,\frac{6}{23}t^2$ (14.90)

$Z = I_M(X,Y)\,(X + Y) + I_{M^c}(X,Y)\,2Y$, $M = \{(t,u) : \max(t,u) \le 1\}$ (14.91)

Exercise 14.15 (Solution on p. 449.) 

$f_{XY}(t,u) = \frac{2}{13}(t + 2u)$, for $0 \le t \le 2$, $0 \le u \le \min\{2t, 3 - t\}$ (see Exercise 31 (Exercise 11.31)
from "Problems on Mathematical Expectation", and Exercise 14.10).

$f_X(t) = I_{[0,1]}(t)\,\frac{12}{13}t^2 + I_{(1,2]}(t)\,\frac{6}{13}(3 - t)$ (14.92)

$Z = I_M(X,Y)\,(X + Y) + I_{M^c}(X,Y)\,2Y^2$, $M = \{(t,u) : t \le 1,\ u \ge 1\}$ (14.93)

Exercise 14.16 (Solution on p. 449.) 

$f_{XY}(t,u) = I_{[0,1]}(t)\,\frac{3}{8}(t^2 + 2u) + I_{(1,2]}(t)\,\frac{9}{14}t^2u^2$, for $0 \le u \le 1$ (see Exercise 32 (Exercise 11.32)
from "Problems on Mathematical Expectation", and Exercise 14.11).

$f_X(t) = I_{[0,1]}(t)\,\frac{3}{8}(t^2 + 1) + I_{(1,2]}(t)\,\frac{3}{14}t^2$ (14.94)

$Z = I_M(X,Y)\,X + I_{M^c}(X,Y)\,XY$, $M = \{(t,u) : u \le \min(1, 2 - t)\}$ (14.95)

Exercise 14.17 (Solution on p. 450.) 

Suppose X ~ uniform on 0 through n and Y ~ conditionally uniform on 0 through i, given X = i.

a. Determine $E[Y]$ from $E[Y|X = i]$.

b. Determine the joint distribution for {X, Y} for n = 50 (see Example 7 (Example 14.7: A
random number N of Bernoulli trials) from "Conditional Expectation, Regression" for a
possible approach). Use jcalc to determine $E[Y]$; compare with the theoretical value.

Exercise 14.18 (Solution on p. 450.) 

Suppose X ~ uniform on 1 through n and Y ~ conditionally uniform on 1 through i, given X = i. 

a. Determine $E[Y]$ from $E[Y|X = i]$.

b. Determine the joint distribution for {X, Y} for n = 50 (see Example 7 (Example 14.7: A
random number N of Bernoulli trials) from "Conditional Expectation, Regression" for a
possible approach). Use jcalc to determine $E[Y]$; compare with the theoretical value.

Exercise 14.19 (Solution on p. 451.)

Suppose X ~ uniform on 1 through n and Y ~ conditionally binomial (i, p), given X = i.

a. Determine $E[Y]$ from $E[Y|X = i]$.

b. Determine the joint distribution for {X, Y} for n = 50 and p = 0.3. Use jcalc to determine 
E [Y]; compare with the theoretical value. 

Exercise 14.20 (Solution on p. 451.) 

A number X is selected randomly from the integers 1 through 100. A pair of dice is thrown X
times. Let Y be the number of sevens thrown on the X tosses. Determine the joint distribution
for {X, Y} and then determine $E[Y]$.

Exercise 14.21 (Solution on p. 452.)

A number X is selected randomly from the integers 1 through 100. Each of two people draws X
times, independently and randomly, a number from 1 to 10. Let Y be the number of matches (i.e.,
both draw ones, both draw twos, etc.). Determine the joint distribution and then determine $E[Y]$.

Exercise 14.22 (Solution on p. 452.) 

$E[Y|X = t] = 10t$ and X has density function $f_X(t) = 4 - 2t$ for $1 \le t \le 2$. Determine $E[Y]$.

Exercise 14.23 (Solution on p. 452.)

$E[Y|X = t] = \frac{2}{3}(1 - t)$ for $0 \le t \le 1$ and X has density function $f_X(t) = 30t^2(1 - t)^2$ for
$0 \le t \le 1$. Determine $E[Y]$.

Exercise 14.24 (Solution on p. 452.)

$E[Y|X = t] = \frac{2}{3}(2 - t)$ and X has density function $f_X(t) = \frac{15}{16}t^2(2 - t)^2$, $0 \le t \le 2$. Determine
$E[Y]$.

Exercise 14.25 (Solution on p. 452.) 

Suppose the pair {X, Y} is independent, with X ~ Poisson ($\mu$) and Y ~ Poisson ($\lambda$). Show that
X is conditionally binomial $(n, \mu/(\mu + \lambda))$, given X + Y = n. That is, show that

$P(X = k|X + Y = n) = C(n,k)\,p^k(1 - p)^{n-k}$, $0 \le k \le n$, for $p = \mu/(\mu + \lambda)$ (14.96)

Exercise 14.26 (Solution on p. 452.) 

Use the fact that $g(X,Y) = g^*(X,Y,Z)$, where $g^*(t,u,v)$ does not vary with v. Extend property
(CE10) to show

$E[g(X,Y)|X = t, Z = v] = E[g(t,Y)|X = t, Z = v]$ a.s. $[P_{XZ}]$ (14.97)

Exercise 14.27 (Solution on p. 452.) 

Use the result of Exercise 14.26 and properties (CE9a) ("(CE9a)", p. 601) and (CE10) to show
that

$E[g(X,Y)|Z = v] = \int E[g(t,Y)|X = t, Z = v]\,F_{X|Z}(dt|v)$ a.s. $[P_Z]$ (14.98)

Exercise 14.28 (Solution on p. 452.) 

A shop which works past closing time to complete jobs on hand tends to speed up service on any
job received during the last hour before closing. Suppose the arrival time of a job in hours before
closing time is a random variable T ~ uniform [0, 1]. Service time Y for a unit received in that
period is conditionally exponential $\beta(2 - u)$, given T = u. Determine the distribution function for
Y.

Exercise 14.29 (Solution on p. 453.)

Time to failure X of a manufactured unit has an exponential distribution. The parameter is
dependent upon the manufacturing process. Suppose the parameter is the value of random variable
H ~ uniform on [0.005, 0.01], and X is conditionally exponential (u), given H = u. Determine
$P(X > 150)$. Determine $E[X|H = u]$ and use this to determine $E[X]$.

Exercise 14.30 (Solution on p. 453.) 

A system has n components. Time to failure of the ith component is $X_i$ and the class
$\{X_i : 1 \le i \le n\}$ is iid exponential ($\lambda$). The system fails if any one or more of the components
fails. Let W be the time to system failure. What is the probability the failure is due to the ith
component?

Suggestion. Note that $W = X_i$ iff $X_j > X_i$ for all $j \ne i$. Thus

$\{W = X_i\} = \{(X_1, X_2, \cdots, X_n) \in Q\}$, $Q = \{(t_1, t_2, \cdots, t_n) : t_k > t_i,\ \forall\, k \ne i\}$ (14.99)

$P(W = X_i) = E[I_Q(X_1, X_2, \cdots, X_n)] = E\{E[I_Q(X_1, X_2, \cdots, X_n)|X_i]\}$ (14.100)

Solutions to Exercises in Chapter 14



Solution to Exercise 14.1 (p. 437) 

The regression line of Y on X is u = 0.5275t + 0.6924.

npr08_07 (Section 17.8.38: npr08_07)
Data are in X, Y, P
jcalc
EYx = sum(u.*P)./sum(P);
disp([X;EYx]')
   -3.1000   -0.0290
   -0.5000   -0.6860
    1.2000    1.3270
    2.4000    2.1960
    3.7000    3.8130
    4.9000    2.5700
G = t.^2.*u + abs(t+u);
EZx = sum(G.*P)./sum(P);
disp([X;EZx]')
   -3.1000    4.0383
   -0.5000    3.5345
    1.2000    6.0139
    2.4000   17.5530
    3.7000   59.7130
    4.9000   69.1757



Solution to Exercise 14.2 (p. 437) 

The regression line of Y on X is u = -0.2584t + 5.6110.

npr08_08 (Section 17.8.39: npr08_08)
Data are in X, Y, P
jcalc
EYx = sum(u.*P)./sum(P);
disp([X;EYx]')
    1.0000    5.5350
    3.0000    5.9869
    5.0000    3.6500
    7.0000    2.3100
    9.0000    2.0254
   11.0000    2.9100
   13.0000    3.1957
   15.0000    0.9100
   17.0000    1.5254
   19.0000    0.9100
M = u<=t;
G = (u-4).*sqrt(t).*M + t.*u.^2.*(1-M);
EZx = sum(G.*P)./sum(P);
disp([X;EZx]')
    1.0000   58.3050
    3.0000  166.7269
    5.0000  175.9322
    7.0000  185.7896
    9.0000  119.7531
   11.0000  105.4076
   13.0000   -2.8999
   15.0000  -11.9675
   17.0000  -10.2031
   19.0000  -13.4690

Solution to Exercise 14.3 (p. 438) 
The regression line of Y on X is u = -0.7793t + 4.3051.

npr08_09 (Section 17.8.40: npr08_09)
Data are in X, Y, P
jcalc
EYx = sum(u.*P)./sum(P);
disp([X;EYx]')
    1.0000    3.8333
    1.5000    3.1250
    2.0000    2.5175
    2.5000    2.3933
    3.0000    2.3900
G = (u - 2.8)./t;
EZx = sum(G.*P)./sum(P);
disp([X;EZx]')
    1.0000    1.0333
    1.5000    0.2167
    2.0000   -0.1412
    2.5000   -0.1627
    3.0000   -0.1367

Solution to Exercise 14.4 (p. 439) 

The regression line of Y on X is u = 1 - t.

$f_{Y|X}(u|t) = \frac{1}{2(1 - t)}$, $0 \le t \le 1$, $0 \le u \le 2(1 - t)$ (14.101)

$E[Y|X = t] = \frac{1}{2(1 - t)}\int_0^{2(1-t)} u\,du = 1 - t$, $0 \le t \le 1$ (14.102)

tuappr: [0 1] [0 2] 200 400 u<=2*(1-t)
EYx = sum(u.*P)./sum(P);
plot(X,EYx)   % Straight line thru (0,1), (1,0)
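The column-by-column averaging behind `EYx = sum(u.*P)./sum(P)` can be cross-checked outside MATLAB. The following is a minimal Python sketch, not the text's tuappr procedure; the function name `cond_mean_grid` and the midpoint grid are assumptions made for illustration.

```python
def cond_mean_grid(density, t_range, u_range, nt, nu):
    """Approximate E[Y | X = t] on a midpoint grid, one t-column at a time."""
    t0, t1 = t_range
    u0, u1 = u_range
    dt = (t1 - t0) / nt
    du = (u1 - u0) / nu
    ts = [t0 + (i + 0.5) * dt for i in range(nt)]
    us = [u0 + (j + 0.5) * du for j in range(nu)]
    means = []
    for t in ts:
        weights = [density(t, u) for u in us]      # unnormalized column of P
        total = sum(weights)
        # Weighted average of u in this column; NaN if the column is empty.
        means.append(sum(u * w for u, w in zip(us, weights)) / total
                     if total > 0 else float("nan"))
    return ts, means

# Solution 14.4: uniform density on the triangle u <= 2(1 - t).
ts, eyx = cond_mean_grid(lambda t, u: 1.0 if u <= 2 * (1 - t) else 0.0,
                         (0.0, 1.0), (0.0, 2.0), 200, 400)
max_err = max(abs(m - (1 - t)) for t, m in zip(ts, eyx))
```

The normalizing constant of the density cancels in each column, which is why an unnormalized indicator suffices here.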

Solution to Exercise 14.5 (p. 439) 
The regression line of Y on X is u = -t/11 + 35/33.

$f_{Y|X}(u|t) = \frac{t + u}{2(t + 1)}$, $0 \le t \le 2$, $0 \le u \le 2$ (14.103)

$E[Y|X = t] = \frac{1}{2(t + 1)}\int_0^2 (tu + u^2)\,du = 1 + \frac{1}{3t + 3}$, $0 \le t \le 2$ (14.104)

tuappr: [0 2] [0 2] 200 200 (1/8)*(t+u)
EYx = sum(u.*P)./sum(P);
eyx = 1 + 1./(3*X+3);
plot(X,EYx,X,eyx)   % Plots nearly indistinguishable

Solution to Exercise 14.6 (p. 439) 

The regression line of Y on X is u = 0.0958t + 1.4876.

$f_{Y|X}(u|t) = \frac{2t + 3u^2}{(1 + t)(1 + 4t + t^2)}$, $0 \le u \le 1 + t$ (14.105)

$E[Y|X = t] = \frac{1}{(1 + t)(1 + 4t + t^2)}\int_0^{1+t} (2tu + 3u^3)\,du$ (14.106)

$= \frac{(t + 1)(t + 3)(3t + 1)}{4(1 + 4t + t^2)}$, $0 \le t \le 2$ (14.107)

tuappr: [0 2] [0 3] 200 300 (3/88)*(2*t + 3*u.^2).*(u<=1+t)
EYx = sum(u.*P)./sum(P);
eyx = (X+1).*(X+3).*(3*X+1)./(4*(1 + 4*X + X.^2));
plot(X,EYx,X,eyx)   % Plots nearly indistinguishable

Solution to Exercise 14.7 (p. 439) 

The regression line of Y on X is u = (23t + 4)/18.

$f_{Y|X}(u|t) = I_{[-1,0]}(t)\,\frac{2u}{(t + 1)^2} + I_{(0,1]}(t)\,\frac{2u}{1 - t^2}$ on the parallelogram (14.108)

$E[Y|X = t] = I_{[-1,0]}(t)\,\frac{1}{(t + 1)^2}\int_0^{t+1} 2u^2\,du + I_{(0,1]}(t)\,\frac{1}{1 - t^2}\int_t^1 2u^2\,du$ (14.109)

$= I_{[-1,0]}(t)\,\frac{2}{3}(t + 1) + I_{(0,1]}(t)\,\frac{2}{3}\,\frac{t^2 + t + 1}{t + 1}$ (14.110)

tuappr: [-1 1] [0 1] 200 100 12*t.^2.*u.*((u<=min(t+1,1))&(u>=max(0,t)))
EYx = sum(u.*P)./sum(P);
M = X<=0;
eyx = (2/3)*(X+1).*M + (2/3)*(1-M).*(X.^2 + X + 1)./(X + 1);
plot(X,EYx,X,eyx)   % Plots quite close

Solution to Exercise 14.8 (p. 439) 

The regression line of Y on X is u = (-124t + 368)/431.

$f_{Y|X}(u|t) = I_{[0,1]}(t)\,2u + I_{(1,2]}(t)\,\frac{2u}{(2 - t)^2}$ (14.111)

$E[Y|X = t] = I_{[0,1]}(t)\int_0^1 2u^2\,du + I_{(1,2]}(t)\,\frac{1}{(2 - t)^2}\int_0^{2-t} 2u^2\,du$ (14.112)

$= I_{[0,1]}(t)\,\frac{2}{3} + I_{(1,2]}(t)\,\frac{2}{3}(2 - t)$ (14.113)

tuappr: [0 2] [0 1] 200 100 (24/11)*t.*u.*(u<=min(1,2-t))
EYx = sum(u.*P)./sum(P);
M = X <= 1;
eyx = (2/3)*M + (2/3).*(2 - X).*(1-M);
plot(X,EYx,X,eyx)   % Plots quite close

Solution to Exercise 14.9 (p. 440) 
The regression line of Y on X is u = 1.0561t - 0.2603.

$f_{Y|X}(u|t) = I_{[0,1]}(t)\,\frac{t + 2u}{2(2 - t)} + I_{(1,2]}(t)\,\frac{t + 2u}{2t^2}$, $0 \le u \le \max(2 - t, t)$ (14.114)

$E[Y|X = t] = I_{[0,1]}(t)\,\frac{1}{2(2 - t)}\int_0^{2-t} (tu + 2u^2)\,du + I_{(1,2]}(t)\,\frac{1}{2t^2}\int_0^t (tu + 2u^2)\,du$ (14.115)

$= I_{[0,1]}(t)\,\frac{1}{12}(t - 2)(t - 8) + I_{(1,2]}(t)\,\frac{7}{12}t$ (14.116)

tuappr: [0 2] [0 2] 200 200 (3/23)*(t+2*u).*(u<=max(2-t,t))
EYx = sum(u.*P)./sum(P);
M = X<=1;
eyx = (1/12)*(X-2).*(X-8).*M + (7/12)*X.*(1-M);
plot(X,EYx,X,eyx)   % Plots quite close

Solution to Exercise 14.10 (p. 440) 
The regression line of Y on X is u = -0.1359t + 1.0839.

$f_{Y|X}(u|t) = I_{[0,1]}(t)\,\frac{t + 2u}{6t^2} + I_{(1,2]}(t)\,\frac{t + 2u}{3(3 - t)}$, $0 \le u \le \min(2t, 3 - t)$ (14.117)

$E[Y|X = t] = I_{[0,1]}(t)\,\frac{1}{6t^2}\int_0^{2t} (tu + 2u^2)\,du + I_{(1,2]}(t)\,\frac{1}{3(3 - t)}\int_0^{3-t} (tu + 2u^2)\,du$ (14.118)

$= I_{[0,1]}(t)\,\frac{11}{9}t + I_{(1,2]}(t)\,\frac{1}{18}(t^2 - 15t + 36)$ (14.119)

tuappr: [0 2] [0 2] 200 200 (2/13)*(t+2*u).*(u<=min(2*t,3-t))
EYx = sum(u.*P)./sum(P);
M = X<=1;
eyx = (11/9)*X.*M + (1/18)*(X.^2 - 15*X + 36).*(1-M);
plot(X,EYx,X,eyx)   % Plots quite close

Solution to Exercise 14.11 (p. 440) 
The regression line of Y on X is u = 0.0817t + 0.5989.

$f_{Y|X}(u|t) = I_{[0,1]}(t)\,\frac{t^2 + 2u}{t^2 + 1} + I_{(1,2]}(t)\,3u^2$, $0 \le u \le 1$ (14.120)

$E[Y|X = t] = I_{[0,1]}(t)\,\frac{1}{t^2 + 1}\int_0^1 (t^2u + 2u^2)\,du + I_{(1,2]}(t)\int_0^1 3u^3\,du$ (14.121)

$= I_{[0,1]}(t)\,\frac{3t^2 + 4}{6(t^2 + 1)} + I_{(1,2]}(t)\,\frac{3}{4}$ (14.122)

tuappr: [0 2] [0 1] 200 100 (3/8)*(t.^2 + 2*u).*(t<=1) + ...
(9/14)*t.^2.*u.^2.*(t>1)
EYx = sum(u.*P)./sum(P);
M = X<=1;
eyx = M.*(3*X.^2 + 4)./(6*(X.^2 + 1)) + (3/4)*(1 - M);
plot(X,EYx,X,eyx)   % Plots quite close

Solution to Exercise 14.12 (p. 440) 

$Z = I_M(X)\,4X + I_N(X)\,(X + Y)$, with M = [0, 1] and N = (1, 2]. Use of linearity, (CE8) ("(CE8)", p. 601), and (CE10) ("(CE10)", p.
601) gives

$E[Z|X = t] = I_M(t)\,4t + I_N(t)\,(t + E[Y|X = t])$ (14.123)

% Continuation of Exercise 14.6
G = 4*t.*(t<=1) + (t + u).*(t>1);
EZx = sum(G.*P)./sum(P);
M = X<=1;
ezx = 4*X.*M + (X + (X+1).*(X+3).*(3*X+1)./(4*(1 + 4*X + X.^2))).*(1-M);
plot(X,EZx,X,ezx)   % Plots nearly indistinguishable

Solution to Exercise 14.13 (p. 440) 

$Z = I_M(X,Y)\,\frac{1}{2}X + I_{M^c}(X,Y)\,Y^2$, $M = \{(t,u) : u > t\}$ (14.125)

$I_M(t,u) = I_{[0,1]}(t)\,I_{[t,1]}(u)$, $I_{M^c}(t,u) = I_{[0,1]}(t)\,I_{[0,t]}(u) + I_{(1,2]}(t)\,I_{[0,2-t]}(u)$ (14.126)

$E[Z|X = t] = I_{[0,1]}(t)\left[\frac{t}{2}\int_t^1 2u\,du + \int_0^t u^2 \cdot 2u\,du\right] + I_{(1,2]}(t)\int_0^{2-t} u^2\,\frac{2u}{(2 - t)^2}\,du$ (14.127)

$= I_{[0,1]}(t)\,\frac{1}{2}t(1 - t^2 + t^3) + I_{(1,2]}(t)\,\frac{1}{2}(2 - t)^2$ (14.128)

% Continuation of Exercise 14.8
Q = u>t;
G = (1/2)*t.*Q + u.^2.*(1-Q);
EZx = sum(G.*P)./sum(P);
M = X <= 1;
ezx = (1/2)*X.*(1-X.^2+X.^3).*M + (1/2)*(2-X).^2.*(1-M);
plot(X,EZx,X,ezx)   % Plots nearly indistinguishable

Solution to Exercise 14.14 (p. 441) 

$Z = I_M(X,Y)\,(X + Y) + I_{M^c}(X,Y)\,2Y$, $M = \{(t,u) : \max(t,u) \le 1\}$ (14.129)

$I_M(t,u) = I_{[0,1]}(t)\,I_{[0,1]}(u)$, $I_{M^c}(t,u) = I_{[0,1]}(t)\,I_{[1,2-t]}(u) + I_{(1,2]}(t)\,I_{[0,t]}(u)$ (14.130)

$E[Z|X = t] = I_{[0,1]}(t)\left[\frac{1}{2(2 - t)}\int_0^1 (t + u)(t + 2u)\,du + \frac{1}{2 - t}\int_1^{2-t} u(t + 2u)\,du\right] + I_{(1,2]}(t)\,2E[Y|X = t]$ (14.131)

$= I_{[0,1]}(t)\,\frac{1}{12}\cdot\frac{2t^3 - 30t^2 + 69t - 60}{t - 2} + I_{(1,2]}(t)\,\frac{7}{6}t$ (14.132)

% Continuation of Exercise 14.9
M = X <= 1;
Q = (t<=1)&(u<=1);
G = (t+u).*Q + 2*u.*(1-Q);
EZx = sum(G.*P)./sum(P);
ezx = (1/12)*M.*(2*X.^3 - 30*X.^2 + 69*X - 60)./(X-2) + (7/6)*X.*(1-M);
plot(X,EZx,X,ezx)

Solution to Exercise 14.15 (p. 441) 

$Z = I_M(X,Y)\,(X + Y) + I_{M^c}(X,Y)\,2Y^2$, $M = \{(t,u) : t \le 1,\ u \ge 1\}$ (14.133)

$I_M(t,u) = I_{[0,1]}(t)\,I_{[1,2]}(u)$, $I_{M^c}(t,u) = I_{[0,1]}(t)\,I_{[0,1)}(u) + I_{(1,2]}(t)\,I_{[0,3-t]}(u)$ (14.134)

$E[Z|X = t] = I_{[0,1/2]}(t)\,\frac{1}{6t^2}\int_0^{2t} 2u^2(t + 2u)\,du + I_{(1/2,1]}(t)\left[\frac{1}{6t^2}\int_0^1 2u^2(t + 2u)\,du + \frac{1}{6t^2}\int_1^{2t} (t + u)(t + 2u)\,du\right]$ (14.135)

$+\ I_{(1,2]}(t)\,\frac{1}{3(3 - t)}\int_0^{3-t} 2u^2(t + 2u)\,du$ (14.136)

$= I_{[0,1/2]}(t)\,\frac{32}{9}t^2 + I_{(1/2,1]}(t)\,\frac{1}{36}\cdot\frac{80t^3 - 6t^2 - 5t + 2}{t^2} + I_{(1,2]}(t)\,\frac{1}{9}(-t^3 + 15t^2 - 63t + 81)$ (14.137)

tuappr: [0 2] [0 2] 200 200 (2/13)*(t + 2*u).*(u<=min(2*t,3-t))
M = (t<=1)&(u>=1);
Q = (t+u).*M + 2*(1-M).*u.^2;
EZx = sum(Q.*P)./sum(P);
N1 = X <= 1/2;
N2 = (X > 1/2)&(X<=1);
N3 = X > 1;
ezx = (32/9)*N1.*X.^2 + (1/36)*N2.*(80*X.^3 - 6*X.^2 - 5*X + 2)./X.^2 ...
+ (1/9)*N3.*(-X.^3 + 15*X.^2 - 63.*X + 81);
plot(X,EZx,X,ezx)

Solution to Exercise 14.16 (p. 441) 

$Z = I_M(X,Y)\,X + I_{M^c}(X,Y)\,XY$, $M = \{(t,u) : u \le \min(1, 2 - t)\}$ (14.138)

$E[Z|X = t] = I_{[0,1]}(t)\int_0^1 \frac{t^3 + 2tu}{t^2 + 1}\,du + I_{(1,2]}(t)\left[\int_0^{2-t} 3tu^2\,du + \int_{2-t}^1 3tu^3\,du\right]$ (14.139)

$= I_{[0,1]}(t)\,t + I_{(1,2]}(t)\left(-\frac{13}{4}t + 12t^2 - 12t^3 + 5t^4 - \frac{3}{4}t^5\right)$ (14.140)

tuappr: [0 2] [0 1] 200 100 (t<=1).*(t.^2 + 2*u)./(t.^2 + 1) + 3*u.^2.*(t>1)
M = u<=min(1,2-t);
G = M.*t + (1-M).*t.*u;
EZx = sum(G.*P)./sum(P);
N = X<=1;
ezx = X.*N + (1-N).*(-(13/4)*X + 12*X.^2 - 12*X.^3 + 5*X.^4 - (3/4)*X.^5);
plot(X,EZx,X,ezx)



Solution to Exercise 14.17 (p. 441) 

a. $E[Y|X = i] = i/2$, so

$E[Y] = \sum_{i=0}^{n} E[Y|X = i]\,P(X = i) = \frac{1}{n + 1}\sum_{i=1}^{n} i/2 = n/4$ (14.141)

b. $P(X = i) = 1/(n + 1)$, $0 \le i \le n$; $P(Y = k|X = i) = 1/(i + 1)$, $0 \le k \le i$; hence
$P(X = i, Y = k) = 1/[(n + 1)(i + 1)]$, $0 \le i \le n$, $0 \le k \le i$.

n = 50; X = 0:n; Y = 0:n;
P0 = zeros(n+1,n+1);
for i = 0:n
P0(i+1,1:i+1) = (1/((n+1)*(i+1)))*ones(1,i+1);
end
P = rot90(P0);
jcalc: X Y P
EY = dot(Y,PY)
EY = 12.5000   % Comparison with part (a): 50/4 = 12.5
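The double sum in part (b) is small enough to evaluate exactly. The following is a Python sketch with rational arithmetic, offered as an independent check rather than a replacement for jcalc; the function name `expected_y_uniform` is an assumption.

```python
from fractions import Fraction

def expected_y_uniform(n):
    """E[Y] when X is uniform on 0..n and Y is uniform on 0..i, given X = i."""
    total = Fraction(0)
    for i in range(n + 1):
        for k in range(i + 1):
            # P(X = i, Y = k) = 1 / ((n + 1)(i + 1))
            total += k * Fraction(1, (n + 1) * (i + 1))
    return total

ey = expected_y_uniform(50)   # exact rational value, to be compared with n/4
```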

Solution to Exercise 14.18 (p. 441) 

a. $E[Y|X = i] = (i + 1)/2$, so

$E[Y] = \sum_{i=1}^{n} E[Y|X = i]\,P(X = i) = \frac{1}{n}\sum_{i=1}^{n} \frac{i + 1}{2} = \frac{n + 3}{4}$ (14.142)

b. $P(X = i) = 1/n$, $1 \le i \le n$; $P(Y = k|X = i) = 1/i$, $1 \le k \le i$; hence $P(X = i, Y = k) = 1/(ni)$, $1 \le
i \le n$, $1 \le k \le i$.

n = 50; X = 1:n; Y = 1:n;
P0 = zeros(n,n);
for i = 1:n
P0(i,1:i) = (1/(n*i))*ones(1,i);
end
P = rot90(P0);
jcalc: P X Y
EY = dot(Y,PY)
EY = 13.2500   % Comparison with part (a): 53/4 = 13.25

Solution to Exercise 14.19 (p. 442)

a. $E[Y|X = i] = ip$, so

$E[Y] = \sum_{i=1}^{n} E[Y|X = i]\,P(X = i) = \frac{p}{n}\sum_{i=1}^{n} i = \frac{p(n + 1)}{2}$ (14.143)

b. $P(X = i) = 1/n$, $1 \le i \le n$; $P(Y = k|X = i)$ = ibinom(i,p,0:i), $0 \le k \le i$.

n = 50; p = 0.3; X = 1:n; Y = 0:n;
P0 = zeros(n,n+1);   % Could use randbern
for i = 1:n
P0(i,1:i+1) = (1/n)*ibinom(i,p,0:i);
end
P = rot90(P0);
jcalc: X Y P
EY = dot(Y,PY)
EY = 7.6500   % Comparison with part (a): 0.3*51/2 = 7.65
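The joint distribution for the uniform-binomial pair can also be built directly from the binomial formula. This Python sketch stands in for the ibinom and jcalc steps; the name `joint_uniform_binomial` is hypothetical and the dictionary layout is a choice made for illustration.

```python
from math import comb

def joint_uniform_binomial(n, p):
    """Return {(i, k): P(X = i, Y = k)} for X uniform on 1..n and
    Y conditionally binomial (i, p), given X = i."""
    return {(i, k): comb(i, k) * p**k * (1 - p)**(i - k) / n
            for i in range(1, n + 1) for k in range(i + 1)}

joint = joint_uniform_binomial(50, 0.3)
ey = sum(k * pr for (i, k), pr in joint.items())   # compare with p(n + 1)/2
```

With n = 100 and p = 1/6 or p = 1/10 the same function covers the dice and matching problems of Exercises 14.20 and 14.21.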

Solution to Exercise 14.20 (p. 442) 

a. $P(X = i) = 1/n$, $E[Y|X = i] = i/6$, so

$E[Y] = \frac{1}{6}\sum_{i=1}^{n} i/n = \frac{n + 1}{12}$ (14.144)

b.

n = 100; p = 1/6; X = 1:n; Y = 0:n; PX = (1/n)*ones(1,n);
P0 = zeros(n,n+1);   % Could use randbern
for i = 1:n
P0(i,1:i+1) = (1/n)*ibinom(i,p,0:i);
end
P = rot90(P0);
jcalc
EY = dot(Y,PY)
EY = 8.4167   % Comparison with part (a): 101/12 = 8.4167

Solution to Exercise 14.21 (p. 442) 

Same as Exercise 14.20, except p = 1/10. $E[Y] = (n + 1)/20$.

n = 100; p = 0.1; X = 1:n; Y = 0:n; PX = (1/n)*ones(1,n);
P0 = zeros(n,n+1);   % Could use randbern
for i = 1:n
P0(i,1:i+1) = (1/n)*ibinom(i,p,0:i);
end
P = rot90(P0);
jcalc
EY = dot(Y,PY)
EY = 5.0500   % Comparison with part (a): EY = 101/20 = 5.05

Solution to Exercise 14.22 (p. 442)

$E[Y] = \int E[Y|X = t]\,f_X(t)\,dt = \int_1^2 10t(4 - 2t)\,dt = 40/3$ (14.145)

Solution to Exercise 14.23 (p. 442)

$E[Y] = \int E[Y|X = t]\,f_X(t)\,dt = \int_0^1 20t^2(1 - t)^3\,dt = 1/3$ (14.146)

Solution to Exercise 14.24 (p. 442)

$E[Y] = \int E[Y|X = t]\,f_X(t)\,dt = \frac{5}{8}\int_0^2 t^2(2 - t)^3\,dt = 2/3$ (14.147)
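The three integrals in Solutions 14.22 through 14.24 are easy to confirm numerically. The following Python sketch uses a plain midpoint rule; the helper name `midpoint_integral` and the step count are assumptions, and the integrands are the products $E[Y|X = t]\,f_X(t)$ that appear in the solutions.

```python
def midpoint_integral(f, a, b, n=20000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

ey_22 = midpoint_integral(lambda t: 10 * t * (4 - 2 * t), 1, 2)           # 40/3
ey_23 = midpoint_integral(lambda t: 20 * t**2 * (1 - t)**3, 0, 1)          # 1/3
ey_24 = midpoint_integral(lambda t: (5 / 8) * t**2 * (2 - t)**3, 0, 2)     # 2/3
```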

Solution to Exercise 14.25 (p. 442) 

X ~ Poisson ($\mu$), Y ~ Poisson ($\lambda$). Use of property (T1) ("(T1)", p. 387) and generating functions shows
that X + Y ~ Poisson ($\mu + \lambda$).

$P(X = k|X + Y = n) = \frac{P(X = k, X + Y = n)}{P(X + Y = n)} = \frac{P(X = k, Y = n - k)}{P(X + Y = n)}$ (14.148)

$= \frac{e^{-\mu}\frac{\mu^k}{k!}\,e^{-\lambda}\frac{\lambda^{n-k}}{(n-k)!}}{e^{-(\mu+\lambda)}\frac{(\mu+\lambda)^n}{n!}} = \frac{n!}{k!\,(n - k)!}\cdot\frac{\mu^k\lambda^{n-k}}{(\mu + \lambda)^n}$ (14.149)

Put $p = \mu/(\mu + \lambda)$ and $q = 1 - p = \lambda/(\mu + \lambda)$ to get the desired result.

Solution to Exercise 14.26 (p. 442)

$E[g(X,Y)|X = t, Z = v] = E[g^*(X,Z,Y)|(X,Z) = (t,v)] = E[g^*(t,v,Y)|(X,Z) = (t,v)]$ (14.150)

$= E[g(t,Y)|X = t, Z = v]$ a.s. $[P_{XZ}]$ by (CE10) (14.151)

Solution to Exercise 14.27 (p. 442)

By (CE9) ("(CE9)", p. 601), $E[g(X,Y)|Z] = E\{E[g(X,Y)|X,Z]\,|Z\} = E[e(X,Z)|Z]$ a.s.
By (CE10),

$E[e(X,Z)|Z = v] = E[e(X,v)|Z = v] =$ (14.152)

$\int e(t,v)\,F_{X|Z}(dt|v)$ a.s. (14.153)

By Exercise 14.26,

$\int E[g(X,Y)|X = t, Z = v]\,F_{X|Z}(dt|v) =$ (14.154)

$\int E[g(t,Y)|X = t, Z = v]\,F_{X|Z}(dt|v)$ a.s. $[P_Z]$ (14.155)

Solution to Exercise 14.28 (p. 442)

$F_Y(v) = \int F_{Y|T}(v|u)\,f_T(u)\,du = \int_0^1 \left(1 - e^{-\beta(2-u)v}\right)du =$ (14.156)

$1 - e^{-2\beta v}\left[\frac{e^{\beta v} - 1}{\beta v}\right] = 1 - e^{-\beta v}\left[\frac{1 - e^{-\beta v}}{\beta v}\right]$, $0 < v$ (14.157)

Solution to Exercise 14.29 (p. 443)

$F_{X|H}(t|u) = 1 - e^{-ut}$, $f_H(u) = \frac{1}{0.005} = 200$, $0.005 \le u \le 0.01$ (14.158)

$F_X(t) = 1 - 200\int_{0.005}^{0.01} e^{-ut}\,du = 1 - \frac{200}{t}\left[e^{-0.005t} - e^{-0.01t}\right]$ (14.159)

$P(X > 150) = \frac{200}{150}\left[e^{-0.75} - e^{-1.5}\right] \approx 0.3323$ (14.160)

$E[X|H = u] = 1/u$, $E[X] = 200\int_{0.005}^{0.01} \frac{du}{u} = 200\ln 2$ (14.161)
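The closed forms for Exercise 14.29 can be evaluated in a few lines. This is a Python sketch under the assumptions of that exercise (H uniform on [a, b], X conditionally exponential (u) given H = u); the function name `survival` is hypothetical.

```python
from math import exp, log

def survival(t, a=0.005, b=0.01):
    """P(X > t) = (1/(b - a)) * integral over [a, b] of exp(-u t) du."""
    return (exp(-a * t) - exp(-b * t)) / ((b - a) * t)

p150 = survival(150)                          # about 0.3323
ex = log(0.01 / 0.005) / (0.01 - 0.005)       # E[X] = 200 ln 2, about 138.63
```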

Solution to Exercise 14.30 (p. 443) 

Let $Q = \{(t_1, t_2, \cdots, t_n) : t_k > t_i,\ k \ne i\}$. Then

$P(W = X_i) = E[I_Q(X_1, X_2, \cdots, X_n)] = E\{E[I_Q(X_1, X_2, \cdots, X_n)|X_i]\}$ (14.162)

$= \int E[I_Q(X_1, X_2, \cdots, t, \cdots, X_n)]\,F_X(dt)$ (14.163)

$E[I_Q(X_1, X_2, \cdots, t, \cdots, X_n)] = \prod_{k \ne i} P(X_k > t) = [1 - F_X(t)]^{n-1}$ (14.164)

If $F_X$ is continuous, strictly increasing, zero for $t < 0$, put $u = F_X(t)$, $du = f_X(t)\,dt$. Then $t = 0$ corresponds to $u = 0$ and
$t = \infty$ to $u = 1$, so that

$P(W = X_i) = \int_0^1 (1 - u)^{n-1}\,du = \int_0^1 u^{n-1}\,du = 1/n$ (14.165)
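The 1/n result can be illustrated by simulation. The following Python sketch draws iid exponential lifetimes and records which component fails first; the function name, the seed, and the parameter values are assumptions chosen only for the demonstration.

```python
import random

def first_failure_frequencies(n, lam, trials, seed=0):
    """Relative frequency with which each of n iid exponential(lam)
    components is the first to fail."""
    rng = random.Random(seed)
    counts = [0] * n
    for _ in range(trials):
        times = [rng.expovariate(lam) for _ in range(n)]
        counts[times.index(min(times))] += 1   # index of the earliest failure
    return [c / trials for c in counts]

freqs = first_failure_frequencies(n=4, lam=2.0, trials=40000)
# Each entry should be close to 1/4, whatever the value of lam.
```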
Chapter 15

Random Selection 

15.1 Random Selection 1 

15.1.1 Introduction 

The usual treatments deal with a single random variable or a fixed, finite number of random variables,
considered jointly. However, there are many common applications in which we select at random a member of
a class of random variables and observe its value, or select a random number of random variables and obtain
some function of those selected. This is formulated with the aid of a counting or selecting random variable
N, which is nonnegative, integer valued. It may be independent of the class selected, or may be related in
some sequential way to members of the class. We consider only the independent case. Many important
problems require optional random variables, sometimes called Markov times. These involve more theory
than we develop in this treatment.
Some common examples:

1. Total demand of N customers: N independent of the individual demands.

2. Total service time for N units: N independent of the individual service times.

3. Net gain in N plays of a game: N independent of the individual gains.

4. Extreme values of N random variables: N independent of the individual values.

5. Random sample of size N: N is usually determined by properties of the sample observed.

6. Decide when to play on the basis of past results: N dependent on past.

15.1.2 A useful model — random sums 

As a basic model, we consider the sum of a random number of members of an iid class. In order to have a
concrete interpretation to help visualize the formal patterns, we think of the demand of a random number of
customers. We suppose the number of customers N is independent of the individual demands. We formulate
a model to be used for a variety of applications.

A basic sequence $\{X_n : 0 \le n\}$ [Demand of n customers]
An incremental sequence $\{Y_n : 0 \le n\}$ [Individual demands]
These are related as follows:

$X_n = \sum_{k=0}^{n} Y_k$ for $n \ge 0$ and $X_n = 0$ for $n < 0$; $Y_n = X_n - X_{n-1}$ for all n (15.1)


1 This content is available online at <http://cnx.Org/content/m23652/l.5/>. 
A counting random variable N. If N = n then n of the $Y_k$ are added to give the compound demand
D (the random sum)

$D = \sum_{k=0}^{N} Y_k = \sum_{k=0}^{\infty} I_{\{N=k\}}X_k = \sum_{k=0}^{\infty} I_{\{k\}}(N)\,X_k$ (15.2)

Note. In some applications the counting random variable may take on the idealized value $\infty$. For example,
in a game that is played until some specified result occurs, this may never happen, so that no finite value
can be assigned to N. In such a case, it is necessary to decide what value $X_\infty$ is to be assigned. For N
independent of the $Y_n$ (hence of the $X_n$), we rarely need to consider this possibility.

Independent selection from an iid incremental sequence 

We assume throughout, unless specifically stated otherwise, that: 

1. $X_0 = Y_0 = 0$

2. $\{Y_k : 1 \le k\}$ is iid

3. $\{N, Y_k : 0 \le k\}$ is an independent class

We utilize repeatedly two important propositions:

1. $E[h(D)|N = n] = E[h(X_n)]$, $n \ge 0$.

2. $M_D(s) = g_N[M_Y(s)]$. If the $Y_n$ are nonnegative integer valued, then so is D and $g_D(s) = g_N[g_Y(s)]$.

DERIVATION 

We utilize properties of generating functions, moment generating functions, and conditional expectation. 

1. $E[I_{\{n\}}(N)\,h(D)] = E[h(D)|N = n]\,P(N = n)$ by definition of conditional expectation, given an
event. Now, $I_{\{n\}}(N)\,h(D) = I_{\{n\}}(N)\,h(X_n)$ and $E[I_{\{n\}}(N)\,h(X_n)] = P(N = n)\,E[h(X_n)]$. Hence
$E[h(D)|N = n]\,P(N = n) = P(N = n)\,E[h(X_n)]$. Division by $P(N = n)$ gives the desired result.

2. By the law of total probability (CE1b), $M_D(s) = E[e^{sD}] = E\{E[e^{sD}|N]\}$. By proposition 1 and the
product rule for moment generating functions,

$E[e^{sD}|N = n] = E[e^{sX_n}] = \prod_{k=1}^{n} E[e^{sY_k}] = M_Y^n(s)$ (15.3)

Hence

$M_D(s) = \sum_{n=0}^{\infty} M_Y^n(s)\,P(N = n) = g_N[M_Y(s)]$ (15.4)

A parallel argument holds for $g_D$ in the integer-valued case.

□

Remark. The result on $M_D$ and $g_D$ may be developed without use of conditional expectation.

$M_D(s) = E[e^{sD}] = \sum_{n=0}^{\infty} E[I_{\{N=n\}}\,e^{sX_n}] = \sum_{n=0}^{\infty} P(N = n)\,E[e^{sX_n}]$ (15.5)

$= \sum_{n=0}^{\infty} P(N = n)\,M_Y^n(s) = g_N[M_Y(s)]$ (15.6)

□



Example 15.1: A service shop 

Suppose the number N of jobs brought to a service shop in a day is Poisson (8). One fourth of
these are items under warranty for which no charge is made. Others fall in one of two categories.
One half of the arriving jobs are charged for one hour of shop time; the remaining one fourth are
charged for two hours of shop time. Thus, the individual shop hour charges $Y_k$ have the common
distribution

Y = [0 1 2] with probabilities PY = [1/4 1/2 1/4] (15.7)

Make the basic assumptions of our model. Determine $P(D \le 4)$.
SOLUTION

$g_N(s) = e^{8(s-1)}$, $g_Y(s) = \frac{1}{4}(1 + 2s + s^2)$ (15.8)

According to the formula developed above,

$g_D(s) = g_N[g_Y(s)] = \exp\left((8/4)(1 + 2s + s^2) - 8\right) = e^{4s}\,e^{2s^2}\,e^{-6}$ (15.9)

Expand the exponentials in power series about the origin and multiply out to get enough terms. The
result of straightforward but somewhat tedious calculations is

$g_D(s) = e^{-6}\left(1 + 4s + 10s^2 + \frac{56}{3}s^3 + \frac{86}{3}s^4 + \cdots\right)$ (15.10)

Taking the coefficients of the generating function, we get

$P(D \le 4) = e^{-6}\left(1 + 4 + 10 + \frac{56}{3} + \frac{86}{3}\right) = e^{-6}\,\frac{187}{3} \approx 0.1545$ (15.11)
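The "somewhat tedious" series multiplication for $e^{4s}e^{2s^2}$ is mechanical and can be delegated to a few lines of code. This Python sketch uses exact fractions; the helper names `exp_series` and `poly_mul` are assumptions introduced for illustration.

```python
from fractions import Fraction
from math import exp, factorial

def exp_series(c, power, terms):
    """Coefficients of exp(c * s**power), truncated to degree terms - 1."""
    coeffs = [Fraction(0)] * terms
    j = 0
    while j * power < terms:
        coeffs[j * power] = Fraction(c) ** j / factorial(j)
        j += 1
    return coeffs

def poly_mul(a, b, terms):
    """Product of two truncated power series, again truncated."""
    out = [Fraction(0)] * terms
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < terms:
                out[i + j] += ai * bj
    return out

# Coefficients of s^0 .. s^4 in exp(4s) * exp(2s^2)
coeffs = poly_mul(exp_series(4, 1, 5), exp_series(2, 2, 5), 5)
p_le_4 = exp(-6) * float(sum(coeffs))   # P(D <= 4)
```

Raising `terms` gives further coefficients of $g_D$, hence $P(D \le m)$ for larger m.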

Example 15.2: A result on Bernoulli trials 

Suppose the counting random variable N ~ binomial (n, p) and $Y_i = I_{E_i}$ with $P(E_i) = p_0$. Then

$g_N(s) = (q + ps)^n$ and $g_Y(s) = q_0 + p_0 s$ (15.12)

By the basic result on random selection, we have

$g_D(s) = g_N[g_Y(s)] = [q + p(q_0 + p_0 s)]^n = [(1 - pp_0) + pp_0 s]^n$ (15.13)

so that D ~ binomial $(n, pp_0)$.
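The composition $g_N[g_Y(s)]$ can be carried out on coefficient lists and compared with the binomial $(n, pp_0)$ probabilities. This is a Python sketch; the helper names and the particular values n = 5, p = 0.4, $p_0$ = 0.5 are assumptions chosen for the check.

```python
from math import comb

def poly_mul(a, b):
    """Coefficient list of the product of two polynomials."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def compose(outer, inner):
    """Coefficients of outer(inner(s)), by Horner's rule on polynomials."""
    result = [outer[-1]]
    for c in reversed(outer[:-1]):
        result = poly_mul(result, inner)
        result[0] += c
    return result

def binomial_pmf(n, p):
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

n, p, p0 = 5, 0.4, 0.5
g_d = compose(binomial_pmf(n, p), [1 - p0, p0])   # coefficients of g_D
target = binomial_pmf(n, p * p0)                  # binomial (n, p*p0)
```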

In the next section we establish useful m-procedures for determining the generating function $g_D$ and the
moment generating function $M_D$ for the compound demand for simple random variables, hence for determining
the complete distribution. Obviously, these will not work for all problems. It may be helpful, if not entirely
sufficient, in such cases to be able to determine the mean value $E[D]$ and variance $Var[D]$. To this end, we
establish the following expressions for the mean and variance.

Example 15.3: Mean and variance of the compound demand 

$E[D] = E[N]\,E[Y]$ and $Var[D] = E[N]\,Var[Y] + Var[N]\,E^2[Y]$ (15.14)

DERIVATION

$E[D] = E\left[\sum_{n=0}^{\infty} I_{\{N=n\}}X_n\right] = \sum_{n=0}^{\infty} P(N = n)\,E[X_n]$ (15.15)

$= E[Y]\sum_{n=0}^{\infty} nP(N = n) = E[Y]\,E[N]$ (15.16)

$E[D^2] = \sum_{n=0}^{\infty} P(N = n)\,E[X_n^2] = \sum_{n=0}^{\infty} P(N = n)\left\{Var[X_n] + E^2[X_n]\right\}$ (15.17)

$= \sum_{n=0}^{\infty} P(N = n)\left\{n\,Var[Y] + n^2E^2[Y]\right\} = E[N]\,Var[Y] + E[N^2]\,E^2[Y]$ (15.18)

Hence

$Var[D] = E[N]\,Var[Y] + E[N^2]\,E^2[Y] - E[N]^2E^2[Y] = E[N]\,Var[Y] + Var[N]\,E^2[Y]$ (15.19)



Example 15.4: Mean and variance for Example 15.1 ( A service shop) 

$E[N] = Var[N] = 8$. By symmetry $E[Y] = 1$. $Var[Y] = 0.25(0 + 2 + 4) - 1 = 0.5$. Hence,

$E[D] = 8 \cdot 1 = 8$, $Var[D] = 8 \cdot 0.5 + 8 \cdot 1 = 12$ (15.20)
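The mean and variance formulas reduce to two lines of arithmetic once the moments of N and Y are known. This Python sketch reproduces the service shop numbers; the function names `mean_var` and `compound_mean_var` are assumptions introduced for illustration.

```python
from fractions import Fraction

def mean_var(values, probs):
    """Mean and variance of a simple random variable."""
    m = sum(v * p for v, p in zip(values, probs))
    return m, sum(v * v * p for v, p in zip(values, probs)) - m * m

def compound_mean_var(en, vn, ey, vy):
    """E[D] = E[N]E[Y] and Var[D] = E[N]Var[Y] + Var[N]E^2[Y]."""
    return en * ey, en * vy + vn * ey**2

ey, vy = mean_var([0, 1, 2], [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)])
ed, vd = compound_mean_var(8, 8, ey, vy)   # N ~ Poisson (8): E[N] = Var[N] = 8
```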



15.1.3 Calculations for the compound demand 

We have m-procedures for performing the calculations necessary to determine the distribution for a composite
demand D when the counting random variable N and the individual demands $Y_k$ are simple random variables
with not too many values. In some cases, such as for a Poisson counting random variable, we are able to
approximate by a simple random variable.

The procedure gend

If the $Y_i$ are nonnegative, integer valued, then so is D, and there is a generating function. We examine
a strategy for computation which is implemented in the m-procedure gend. Suppose

$g_N(s) = p_0 + p_1s + p_2s^2 + \cdots + p_ns^n$ (15.21)

$g_Y(s) = \pi_0 + \pi_1s + \pi_2s^2 + \cdots + \pi_ms^m$ (15.22)

The coefficients of $g_N$ and $g_Y$ are the probabilities of the values of N and Y, respectively. We enter these
and calculate the coefficients for powers of $g_Y$:

gN = $[p_0\ p_1\ \cdots\ p_n]$          $1 \times (n + 1)$     Coefficients of $g_N$
y = $[\pi_0\ \pi_1\ \cdots\ \pi_m]$     $1 \times (m + 1)$     Coefficients of $g_Y$
y2 = conv(y, y)              $1 \times (2m + 1)$    Coefficients of $g_Y^2$
y3 = conv(y, y2)             $1 \times (3m + 1)$    Coefficients of $g_Y^3$
...
yn = conv(y, y(n - 1))       $1 \times (nm + 1)$    Coefficients of $g_Y^n$     (15.23)

We wish to generate a matrix P whose rows contain the joint probabilities. The probabilities in the ith row
consist of the coefficients for the appropriate power of $g_Y$ multiplied by the probability N has that value.
To achieve this, we need a matrix, each of whose n + 1 rows has nm + 1 elements, the length of yn. We
begin by "preallocating" zeros to the rows. That is, we set P = zeros(n + 1, n*m + 1). We then replace the
appropriate elements of the successive rows. The replacement probabilities for the ith row are obtained by
the convolution of $g_Y$ and the power of $g_Y$ for the previous row. When the matrix P is completed, we remove
zero rows and columns, corresponding to missing values of N and D (i.e., values with zero probability). To
orient the joint probabilities as on the plane, we rotate P ninety degrees counterclockwise. With the joint
distribution, we may then calculate any desired quantities.
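The repeated-convolution strategy just described can be sketched in a few lines. The following Python code is an illustration of the idea only, not the gend m-procedure itself: the function names are assumptions, and the sketch neither removes zero rows and columns nor rotates the matrix. The data are those of Example 15.5.

```python
def conv(a, b):
    """Polynomial coefficient convolution (the role of MATLAB's conv)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def compound_distribution(g_n, g_y):
    """Return (joint, pd) with joint[i][k] = P(N = i, D = k), pd[k] = P(D = k)."""
    n, m = len(g_n) - 1, len(g_y) - 1
    width = n * m + 1
    joint = [[0.0] * width for _ in range(n + 1)]   # preallocate zeros
    power = [1.0]                                   # coefficients of g_Y^0
    for i, p_i in enumerate(g_n):
        if i > 0:
            power = conv(power, g_y)                # coefficients of g_Y^i
        for k, c in enumerate(power):
            joint[i][k] = p_i * c                   # row i scaled by P(N = i)
    pd = [sum(row[k] for row in joint) for k in range(width)]
    return joint, pd

g_n = [0, 1 / 3, 1 / 3, 1 / 3]   # N equally likely to be 1, 2, or 3
g_y = [0.5, 0.4, 0.1]            # Y takes values 0, 1, 2
joint, pd = compound_distribution(g_n, g_y)
```

Summing the columns of the joint matrix gives the marginal distribution for D, which is what gend reports as gD.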
Example 15.5: A compound demand

The number of customers in a major appliance store is equally likely to be 1, 2, or 3. Each customer
buys 0, 1, or 2 items with respective probabilities 0.5, 0.4, 0.1. Customers buy independently,
regardless of the number of customers. First we determine the matrices representing $g_N$ and $g_Y$. The
coefficients are the probabilities that each integer value is observed. Note that the zero coefficients
for any missing powers must be included.

gN = (1/3)*[0 1 1 1];    % Note zero coefficient for missing zero power
gY = 0.1*[5 4 1];        % All powers 0 thru 2 have positive coefficients
gend
Do not forget zero coefficients for missing powers
Enter the gen fn COEFFICIENTS for gN  gN    % Coefficient matrix named gN
Enter the gen fn COEFFICIENTS for gY  gY    % Coefficient matrix named gY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view distribution for D, call for gD
disp(gD)                 % Optional display of complete distribution
         0    0.2917
    1.0000    0.3667
    2.0000    0.2250
    3.0000    0.0880
    4.0000    0.0243
    5.0000    0.0040
    6.0000    0.0003
EN = N*PN'
EN =   2
EY = Y*PY'
EY =  0.6000
ED = D*PD'
ED =  1.2000             % Agrees with theoretical EN*EY
P3 = (D>=3)*PD'
P3 =  0.1167
[N,D,t,u,PN,PD,PL] = jcalcf(N,D,P);
EDn = sum(u.*P)./sum(P);
disp([N;EDn]')
    1.0000    0.6000     % Agrees with theoretical E[D|N=n] = n*EY
    2.0000    1.2000
    3.0000    1.8000
VD = (D.^2)*PD' - ED^2
VD =  1.1200             % Agrees with theoretical EN*VY + VN*EY^2

Example 15.6: A numerical example 



$g_N(s) = \frac{1}{5}(1 + s + s^2 + s^3 + s^4)$, $g_Y(s) = 0.1(5s + 3s^2 + 2s^3)$ (15.24)

Note that the zero power is missing from gY, corresponding to the fact that P(Y = 0) = 0.

gN = 0.2*[1 1 1 1 1];
gY = 0.1*[0 5 3 2];      % Note the zero coefficient in the zero position
gend
Do not forget zero coefficients for missing powers
Enter the gen fn COEFFICIENTS for gN  gN
Enter the gen fn COEFFICIENTS for gY  gY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view distribution for D, call for gD
disp(gD)                 % Optional display of complete distribution
         0    0.2000
    1.0000    0.1000
    2.0000    0.1100
    3.0000    0.1250
    4.0000    0.1155
    5.0000    0.1110
    6.0000    0.0964
    7.0000    0.0696
    8.0000    0.0424
    9.0000    0.0203
   10.0000    0.0075
   11.0000    0.0019
   12.0000    0.0003
P3 = (D == 3)*PD'        % P(D=3)
P3 =  0.1250
P4_12 = ((D >= 4)&(D <= 12))*PD'
P4_12 = 0.4650           % P(4 <= D <= 12)

Example 15.7: Number of successes for random number $N$ of trials.

We are interested in the number of successes in $N$ trials for a general counting random variable. This
is a generalization of the Bernoulli case in Example 15.2 (A result on Bernoulli trials). Suppose,
as in Example 15.5 (A compound demand), the number of customers in a major appliance store is
equally likely to be 1, 2, or 3, and each buys at least one item with probability $p = 0.6$. Determine
the distribution for the number $D$ of buying customers.

SOLUTION 

We use gN, gY, and gend. 

gN = (1/3)*[0 1 1 1];     % Note zero coefficient for missing zero power
gY = [0.4 0.6];           % Generating function for the indicator function
gend
Do not forget zero coefficients for missing powers
Enter gen fn COEFFICIENTS for gN  gN
Enter gen fn COEFFICIENTS for gY  gY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view distribution for D, call for gD
disp(gD)
         0    0.2080
    1.0000    0.4560
    2.0000    0.2640
    3.0000    0.0720

The procedure gend is limited to simple $N$ and $Y_k$, with nonnegative integer values. Sometimes, a random
variable with unbounded range may be approximated by a simple random variable. The solution in the
following example utilizes such an approximation procedure for the counting random variable $N$.

Example 15.8: Solution of the shop time Example 15.1 (A service shop)

The number $N$ of jobs brought to a service shop in a day is Poisson (8). The individual shop hour
charges $Y_k$ have the common distribution Y = [0 1 2] with probabilities PY = [1/4 1/2 1/4].

Under the basic assumptions of our model, determine $P(D \le 4)$.

SOLUTION

Since Poisson $N$ is unbounded, we need to check for a sufficient number of terms in a simple
approximation. Then we proceed as in the simple case.

pa = cpoisson(8,10:5:30)  % Check for sufficient number of terms
pa =   0.2834    0.0173    0.0003    0.0000    0.0000
p25 = cpoisson(8,25)      % Check on choice of n = 25
p25 =  1.1722e-06
gN = ipoisson(8,0:25);    % Approximate gN
gY = 0.25*[1 2 1];
gend
Do not forget zero coefficients for missing powers
Enter gen fn COEFFICIENTS for gN  gN
Enter gen fn COEFFICIENTS for gY  gY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view distribution for D, call for gD
disp(gD(D<=20,:))         % Calculated values to D = 50
         0    0.0025      % Display for D <= 20
    1.0000    0.0099
    2.0000    0.0248
    3.0000    0.0463
    4.0000    0.0711
    5.0000    0.0939
    6.0000    0.1099
    7.0000    0.1165
    8.0000    0.1132
    9.0000    0.1021
   10.0000    0.0861
   11.0000    0.0684
   12.0000    0.0515
   13.0000    0.0369
   14.0000    0.0253
   15.0000    0.0166
   16.0000    0.0105
   17.0000    0.0064
   18.0000    0.0037
   19.0000    0.0021
   20.0000    0.0012
sum(PD)                   % Check on sufficiency of approximation
ans =  1.0000
P4 = (D<=4)*PD'
P4 =  0.1545              % Theoretical value (4 places) = 0.1545
ED = D*PD'
ED =  8.0000              % Theoretical = 8 (Example 15.4)
VD = (D.^2)*PD' - ED^2
VD = 11.9999              % Theoretical = 12 (Example 15.4)

The m-procedures mgd and jmgd

The next example shows a fundamental limitation of the gend procedure. The values for the individual 
demands are not limited to integers, and there are considerable gaps between the values. In this case, we 
need to implement the moment generating function $M_D$ rather than the generating function $g_D$.

In the generating function case, it is as easy to develop the joint distribution for {N, D} as to develop the 
marginal distribution for D. For the moment generating function, the joint distribution requires considerably 
more computation. As a consequence, we find it convenient to have two m-procedures: mgd for the marginal 
distribution and jmgd for the joint distribution. 

Instead of the convolution procedure used in gend to determine the distribution for the sums of the 
individual demands, the m-procedure mgd utilizes the m-function mgsum to obtain these distributions. The 
distributions for the various sums are concatenated into two row vectors, to which csort is applied to obtain 
the distribution for the compound demand. The procedure requires as input the generating function for $N$
and the actual distribution, Y and PY, for the individual demands. For gN, it is necessary to treat the 
coefficients as in gend. However, the actual values and probabilities in the distribution for Y are put into a 
pair of row matrices. If Y is integer valued, there are no zeros in the probability matrix for missing values. 

Example 15.9: Noninteger values 

A service shop has three standard charges for a certain class of warranty services it performs: $10, 
$12.50, and $15. The number of jobs received in a normal work day can be considered a random 
variable $N$ which takes on values 0, 1, 2, 3, 4 with equal probabilities 0.2. The job types for arrivals
may be represented by an iid class $\{Y_i : 1 \le i \le 4\}$, independent of the arrival process. The $Y_i$
take on values 10, 12.5, 15 with respective probabilities 0.5, 0.3, 0.2. Let $C$ be the total amount of
services rendered in a day. Determine the distribution for $C$.
SOLUTION 

gN = 0.2*[1 1 1 1 1];     % Enter data
Y = [10 12.5 15];
PY = 0.1*[5 3 2];
mgd                       % Call for procedure
Enter gen fn COEFFICIENTS for gN  gN
Enter VALUES for Y  Y
Enter PROBABILITIES for Y  PY
Values are in row matrix D; probabilities are in PD.
To view the distribution, call for mD.
disp(mD)                  % Optional display of distribution
         0    0.2000
   10.0000    0.1000
   12.5000    0.0600
   15.0000    0.0400
   20.0000    0.0500
   22.5000    0.0600
   25.0000    0.0580
   27.5000    0.0240
   30.0000    0.0330
   32.5000    0.0450
   35.0000    0.0570
   37.5000    0.0414
   40.0000    0.0353
   42.5000    0.0372
   45.0000    0.0486
   47.5000    0.0468
   50.0000    0.0352
   52.5000    0.0187
   55.0000    0.0075
   57.5000    0.0019
   60.0000    0.0003

We next recalculate Example 15.6 (A numerical example), above, using mgd rather than gend. 

Example 15.10: Recalculation of Example 15.6 (A numerical example) 

In Example 15.6 (A numerical example), we have 

$$g_N(s) = \frac{1}{5}\left(1 + s + s^2 + s^3 + s^4\right) \qquad g_Y(s) = 0.1\left(5s + 3s^2 + 2s^3\right) \qquad (15.25)$$

This means that the distribution for $Y$ is Y = [1 2 3] and PY = 0.1*[5 3 2].

We use the same expression for gN as in Example 15.6 (A numerical example). 

gN = 0.2*ones(1,5);
Y = 1:3;
PY = 0.1*[5 3 2];
mgd
Enter gen fn COEFFICIENTS for gN  gN
Enter VALUES for Y  Y
Enter PROBABILITIES for Y  PY
Values are in row matrix D; probabilities are in PD.
To view the distribution, call for mD.
disp(mD)
         0    0.2000
    1.0000    0.1000
    2.0000    0.1100
    3.0000    0.1250
    4.0000    0.1155
    5.0000    0.1110
    6.0000    0.0964
    7.0000    0.0696
    8.0000    0.0424
    9.0000    0.0203
   10.0000    0.0075
   11.0000    0.0019
   12.0000    0.0003
P3 = (D==3)*PD'
P3 =   0.1250
ED = D*PD'
ED =   3.4000
P_4_12 = ((D>=4)&(D<=12))*PD'
P_4_12 =  0.4650
P7 = (D>=7)*PD'
P7 =   0.1421

As expected, the results are the same as those obtained with gend. 

If it is desired to obtain the joint distribution for $\{N, D\}$, we use a modification of mgd called jmgd. The
complications come in placing the probabilities in the P matrix in the desired positions. This requires
some calculations to determine the appropriate size of the matrices used as well as a procedure to put each
probability in the position corresponding to its D value. Actual operation is quite similar to the operation
of mgd, and requires the same data format.

A principal use of the joint distribution is to demonstrate features of the model, such as $E[D|N = n] = nE[Y]$,
etc. This, of course, is utilized in obtaining the expressions for $M_D(s)$ in terms of $g_N(s)$ and
$M_Y(s)$. This result guides the development of the computational procedures, but these do not depend upon
this result. However, it is usually helpful to demonstrate the validity of the assumptions in typical examples.

Remark. In general, if the use of gend is appropriate, it is faster and more efficient than mgd (or jmgd). 
And it will handle somewhat larger problems. But both m-procedures work quite well for problems of 
moderate size, and are convenient tools for solving various "compound demand" type problems. 

15.2 Some Random Selection Problems 2 

In the unit on Random Selection (Section 15.1), we develop some general theoretical results and compu- 
tational procedures using MATLAB. In this unit, we extend the treatment to a variety of problems. We 
establish some useful theoretical results and in some cases use MATLAB procedures, including those in the 
unit on random selection. 

15.2.1 The Poisson decomposition

In many problems, the individual demands may be categorized in one of $m$ types. If the random variable $T_i$
is the type of the $i$th arrival and the class $\{T_i : 1 \le i\}$ is iid, we have multinomial trials. For $m = 2$ we have
the Bernoulli or binomial case, in which one type is called a success and the other a failure.

Multinomial trials

We analyze such a sequence of trials as follows. Suppose there are $m$ types, which we number 1 through $m$.
Let $E_{ki}$ be the event that type $k$ occurs on the $i$th component trial. For each $i$, the class $\{E_{ki} : 1 \le k \le m\}$
is a partition, since on each component trial exactly one of the types will occur. The type on the $i$th trial
may be represented by the type random variable

$$T_i = \sum_{k=1}^{m} k I_{E_{ki}} \qquad (15.26)$$

We assume

$$\{T_i : 1 \le i\} \text{ is iid, with } P(T_i = k) = P(E_{ki}) = p_k \text{ invariant with } i \qquad (15.27)$$

In a sequence of $n$ trials, we let $N_{kn}$ be the number of occurrences of type $k$. Then

$$N_{kn} = \sum_{i=1}^{n} I_{E_{ki}} \quad \text{with} \quad \sum_{k=1}^{m} N_{kn} = n \qquad (15.28)$$

Now each $N_{kn} \sim$ binomial $(n, p_k)$. The class $\{N_{kn} : 1 \le k \le m\}$ cannot be independent, since it sums to
$n$. If the values of $m - 1$ of them are known, the value of the other is determined. If $n_1 + n_2 + \cdots + n_m = n$,
the event

$$\{N_{1n} = n_1, N_{2n} = n_2, \cdots, N_{mn} = n_m\} \qquad (15.29)$$

is one of the

$$C(n; n_1, n_2, \cdots, n_m) = n!/(n_1! \, n_2! \cdots n_m!) \qquad (15.30)$$

ways of arranging $n_1$ of the $E_{1i}$, $n_2$ of the $E_{2i}$, $\cdots$, $n_m$ of the $E_{mi}$. Each such arrangement has probability
$p_1^{n_1} p_2^{n_2} \cdots p_m^{n_m}$, so that

$$P(N_{1n} = n_1, N_{2n} = n_2, \cdots, N_{mn} = n_m) = n! \prod_{k=1}^{m} \frac{p_k^{n_k}}{n_k!} \qquad (15.31)$$



This set of joint probabilities constitutes the multinomial distribution. For $m = 2$, and type 1 a success,
this is the binomial distribution with parameter $(n, p_1)$.

2 This content is available online at <http://cnx.org/content/m23664/1.6/>.

A random number of multinomial trials 

We consider, in particular, the case of a random number $N$ of multinomial trials, where $N \sim$ Poisson
$(\mu)$. Let $N_k$ be the number of results of type $k$ in a random number $N$ of multinomial trials.

$$N_k = \sum_{i=1}^{N} I_{E_{ki}} = \sum_{n=1}^{\infty} I_{\{N=n\}} N_{kn} \quad \text{with} \quad \sum_{k=1}^{m} N_k = N \qquad (15.32)$$

Poisson decomposition 

Suppose 

1. $N \sim$ Poisson $(\mu)$

2. $\{T_i : 1 \le i\}$ is iid with $P(T_i = k) = p_k$, $1 \le k \le m$

3. $\{N, T_i : 1 \le i\}$ is independent

Then

a. Each $N_k \sim$ Poisson $(\mu p_k)$

b. $\{N_k : 1 \le k \le m\}$ is independent.

□

The usefulness of this remarkable result is enhanced by the fact that the sum of independent Poisson
random variables is also Poisson, with $\mu$ for the sum the sum of the $\mu_i$ for the variables added. This is readily
established with the aid of the generating function. Before verifying the propositions above, we consider some 
examples. 

Example 15.11: A shipping problem 

The number $N$ of orders per day received by a mail order house is Poisson (300). Orders are
shipped by next day express, by second day priority, or by regular parcel mail. Suppose 4/10 of 
the customers want next day express, 5/10 want second day priority, and 1/10 require regular mail. 
Make the usual assumptions on compound demand. What is the probability that fewer than 150 
want next day express? What is the probability that fewer than 300 want one or the other of the 
two faster deliveries? 

SOLUTION 

Model as a random number of multinomial trials, with three outcome types: Type 1 is next day
express, Type 2 is second day priority, and Type 3 is regular mail, with respective probabilities $p_1 = 0.4$,
$p_2 = 0.5$, and $p_3 = 0.1$. Then $N_1 \sim$ Poisson $(0.4 \cdot 300 = 120)$, $N_2 \sim$ Poisson $(0.5 \cdot 300 = 150)$,
and $N_3 \sim$ Poisson $(0.1 \cdot 300 = 30)$. Also $N_1 + N_2 \sim$ Poisson $(120 + 150 = 270)$.

P1 = 1 - cpoisson(120,150)
P1 = 0.9954
P12 = 1 - cpoisson(270,300)
P12 = 0.9620

Example 15.12: Message routing 

A junction point in a network has two incoming lines and two outgoing lines. The number of
incoming messages $N_1$ on line one in one hour is Poisson (50); on line 2 the number is $N_2 \sim$ Poisson
(45). On incoming line 1 the messages have probability $p_{1a} = 0.33$ of leaving on outgoing line a
and $1 - p_{1a}$ of leaving on line b. The messages coming in on line 2 have probability $p_{2a} = 0.47$ of
leaving on line a. Under the usual independence assumptions, what is the distribution of outgoing
messages on line a? What are the probabilities of at least 30, 35, 40 outgoing messages on line a?

SOLUTION 

By the Poisson decomposition, $N_a \sim$ Poisson $(50 \cdot 0.33 + 45 \cdot 0.47 = 37.65)$.

ma = 50*0.33 + 45*0.47
ma = 37.6500
Pa = cpoisson(ma,30:5:40)
Pa = 0.9119 0.6890 0.3722
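The Poisson probabilities in Example 15.11 and Example 15.12 can be recomputed without the m-functions. The Python sketch below assumes, from the way it is used in the transcripts, that cpoisson(mu,k) returns $P(N \ge k)$; the helper names `poisson_pmf` and `poisson_tail` are hypothetical.

```python
import math


def poisson_pmf(mu, k):
    """P(N = k) for N ~ Poisson(mu), computed in log space to avoid overflow."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def poisson_tail(mu, k):
    """P(N >= k) for N ~ Poisson(mu)."""
    return 1.0 - sum(poisson_pmf(mu, j) for j in range(k))


# Example 15.11: N1 ~ Poisson(120), N1 + N2 ~ Poisson(270)
P1 = 1 - poisson_tail(120, 150)        # P(N1 < 150)
P12 = 1 - poisson_tail(270, 300)       # P(N1 + N2 < 300)

# Example 15.12: Na ~ Poisson(50*0.33 + 45*0.47)
ma = 50 * 0.33 + 45 * 0.47
Pa = [poisson_tail(ma, k) for k in (30, 35, 40)]
print(round(P1, 4), round(P12, 4), [round(p, 4) for p in Pa])
```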

VERIFICATION of the Poisson decomposition 

a. $N_k = \sum_{i=1}^{N} I_{E_{ki}}$.

This is composite demand with $Y_k = I_{E_{ki}}$, so that $g_{Y_k}(s) = q_k + s p_k = 1 + p_k(s - 1)$. Therefore,

$$g_{N_k}(s) = g_N[g_{Y_k}(s)] = e^{\mu(1 + p_k(s-1) - 1)} = e^{\mu p_k(s-1)} \qquad (15.33)$$

which is the generating function for $N_k \sim$ Poisson $(\mu p_k)$.

b. For any $n_1, n_2, \cdots, n_m$, let $n = n_1 + n_2 + \cdots + n_m$, and consider

$$A = \{N_1 = n_1, N_2 = n_2, \cdots, N_m = n_m\} = \{N = n\} \cap \{N_{1n} = n_1, N_{2n} = n_2, \cdots, N_{mn} = n_m\} \qquad (15.34)$$

Since $N$ is independent of the class of $I_{E_{ki}}$, the class

$$\{\{N = n\}, \{N_{1n} = n_1, N_{2n} = n_2, \cdots, N_{mn} = n_m\}\} \qquad (15.35)$$

is independent. By the product rule and the multinomial distribution

$$P(A) = e^{-\mu}\frac{\mu^n}{n!} \cdot n! \prod_{k=1}^{m} \frac{p_k^{n_k}}{n_k!} = \prod_{k=1}^{m} e^{-\mu p_k}\frac{(\mu p_k)^{n_k}}{n_k!} = \prod_{k=1}^{m} P(N_k = n_k) \qquad (15.36)$$

The second product uses the fact that

$$e^{\mu} = e^{\mu(p_1 + p_2 + \cdots + p_m)} = \prod_{k=1}^{m} e^{\mu p_k} \qquad (15.37)$$

Thus, the product rule holds for the class $\{N_k : 1 \le k \le m\}$, so that it is independent.
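Proposition (a) can also be checked numerically by conditioning on $N$: given $N = n$, the count of type 1 results is binomial $(n, p_1)$. The Python sketch below sums over $n$ and compares with the Poisson $(\mu p_1)$ probabilities. The parameter values and the truncation point `n_max` are illustrative assumptions, and the function names are hypothetical.

```python
import math


def poisson_pmf(mu, k):
    """P(N = k) for N ~ Poisson(mu), computed in log space."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def thinned_pmf(mu, p, j, n_max=200):
    """P(N_1 = j) = sum over n of P(N = n) C(n, j) p^j q^(n-j), truncated at n_max."""
    q = 1.0 - p
    return sum(poisson_pmf(mu, n) * math.comb(n, j) * p**j * q**(n - j)
               for n in range(j, n_max + 1))


mu, p = 6.0, 0.4                       # illustrative values, not from the text
max_gap = max(abs(thinned_pmf(mu, p, j) - poisson_pmf(mu * p, j))
              for j in range(15))
print(max_gap)                         # discrepancy between the two calculations
```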

15.2.2 Extreme values 

Consider an iid class $\{Y_i : 1 \le i\}$ of nonnegative random variables. For any positive integer $n$ we let

$$V_n = \min\{Y_1, Y_2, \cdots, Y_n\} \quad \text{and} \quad W_n = \max\{Y_1, Y_2, \cdots, Y_n\} \qquad (15.38)$$

Then

$$P(V_n > t) = P^n(Y > t) \quad \text{and} \quad P(W_n \le t) = P^n(Y \le t) \qquad (15.39)$$

Now consider a random number $N$ of the $Y_i$. The minimum and maximum random variables are

$$V_N = \sum_{n=0}^{\infty} I_{\{N=n\}} V_n \quad \text{and} \quad W_N = \sum_{n=0}^{\infty} I_{\{N=n\}} W_n \qquad (15.40)$$

□

Computational formulas 

If we set $V_0 = W_0 = 0$, then

a. $F_V(t) = P(V \le t) = 1 + P(N = 0) - g_N[P(Y > t)]$

b. $F_W(t) = g_N[P(Y \le t)]$

These results are easily established as follows. $\{V_N > t\} = \bigvee_{n=0}^{\infty} \{N = n\}\{V_n > t\}$. By additivity and
independence of $\{N, V_n\}$ for each $n$

$$P(V_N > t) = \sum_{n=0}^{\infty} P(N = n) P(V_n > t) = \sum_{n=1}^{\infty} P(N = n) P^n(Y > t), \quad \text{since } P(V_0 > t) = 0 \qquad (15.41)$$

If we add into the last sum the term $P(N = 0) P^0(Y > t) = P(N = 0)$ then subtract it, we have

$$P(V_N > t) = \sum_{n=0}^{\infty} P(N = n) P^n(Y > t) - P(N = 0) = g_N[P(Y > t)] - P(N = 0) \qquad (15.42)$$

A similar argument holds for proposition (b). In this case, we do not have the extra term for $\{N = 0\}$,
since $P(W_0 \le t) = 1$.

Special case. In some cases, $N = 0$ does not correspond to an admissible outcome (see Example 15.14
(Lowest Bidder), below, on lowest bidder and Example 15.16 (Batch testing)). In that case

$$F_V(t) = \sum_{n=1}^{\infty} P(V_n \le t) P(N = n) = \sum_{n=1}^{\infty} [1 - P^n(Y > t)] P(N = n) = \sum_{n=1}^{\infty} P(N = n) - \sum_{n=1}^{\infty} P^n(Y > t) P(N = n) \qquad (15.43)$$

Add $P(N = 0) = P^0(Y > t) P(N = 0)$ to each of the sums to get

$$F_V(t) = 1 - \sum_{n=0}^{\infty} P^n(Y > t) P(N = n) = 1 - g_N[P(Y > t)] \qquad (15.44)$$

□

Example 15.13: Maximum service time 

The number $N$ of jobs coming into a service center in a week is a random quantity having a Poisson
(20) distribution. Suppose the service times (in hours) for individual units are iid, with common
distribution exponential (1/3). What is the probability the maximum service time for the units is
no greater than 6, 9, 12, 15, 18 hours?

SOLUTION

$$P(W_N \le t) = g_N[P(Y \le t)] = e^{20[F_Y(t) - 1]} = \exp\left(-20e^{-t/3}\right) \qquad (15.45)$$

t = 6:3:18;
PW = exp(-20*exp(-t/3));
disp([t;PW]')
    6.0000    0.0668
    9.0000    0.3694
   12.0000    0.6933
   15.0000    0.8739
   18.0000    0.9516
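Formula (b), $F_W(t) = g_N[P(Y \le t)]$, can be checked against the defining sum $\sum_n P(N = n) P^n(Y \le t)$. The Python sketch below does this for the data of Example 15.13; the function names and the truncation point `n_max` are assumptions made for the illustration.

```python
import math


def poisson_pmf(mu, k):
    """P(N = k) for N ~ Poisson(mu), computed in log space."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def F_W_closed(t, mu=20.0, rate=1/3):
    """g_N[P(Y <= t)] with g_N(s) = exp(mu (s - 1)) and Y ~ exponential(rate)."""
    return math.exp(-mu * math.exp(-rate * t))


def F_W_series(t, mu=20.0, rate=1/3, n_max=150):
    """Direct sum over n of P(N = n) P(Y <= t)^n, truncated at n_max."""
    s = 1.0 - math.exp(-rate * t)
    return sum(poisson_pmf(mu, n) * s**n for n in range(n_max + 1))


for t in (6, 9, 12, 15, 18):
    print(t, round(F_W_closed(t), 4), round(F_W_series(t), 4))
```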



Example 15.14: Lowest Bidder 

A manufacturer seeks bids on a modification of one of his processing units. Twenty contractors are
invited to bid. They bid with probability 0.3, so that the number of bids $N \sim$ binomial (20, 0.3).
Assume the bids $Y_i$ (in thousands of dollars) form an iid class. The market is such that the bids
have a common distribution symmetric triangular on (150, 250). What is the probability of at least
one bid no greater than 170, 180, 190, 200, 210? Note that no bid is not a low bid of zero, hence
we must use the special case.

SOLUTION

$$P(V \le t) = 1 - g_N[P(Y > t)] = 1 - (0.7 + 0.3p)^{20} \quad \text{where } p = P(Y > t)$$

Solving graphically for $p = P(Y > t)$, we get

p = [23/25 41/50 17/25 1/2 8/25] for t = [170 180 190 200 210]

Now $g_N(s) = (0.7 + 0.3s)^{20}$. We use MATLAB to obtain

t = [170 180 190 200 210];
p = [23/25 41/50 17/25 1/2 8/25];
PV = 1 - (0.7 + 0.3*p).^20;
disp([t;p;PV]')
  170.0000    0.9200    0.3848
  180.0000    0.8200    0.6705
  190.0000    0.6800    0.8671
  200.0000    0.5000    0.9612
  210.0000    0.3200    0.9896
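The values of $p = P(Y > t)$, obtained graphically above, also follow from the closed form of the symmetric triangular distribution. The Python sketch below computes them and then applies the special case formula $F_V(t) = 1 - g_N[P(Y > t)]$. It is an illustration only; the function names `tri_survival` and `lowest_bid_cdf` are hypothetical.

```python
def tri_survival(t, a=150.0, b=250.0):
    """P(Y > t) for Y symmetric triangular on (a, b)."""
    mid = (a + b) / 2
    scale = 2.0 / (b - a) ** 2
    if t <= a:
        return 1.0
    if t >= b:
        return 0.0
    if t <= mid:
        return 1.0 - scale * (t - a) ** 2   # left half: F(t) = 2 (t - a)^2 / (b - a)^2
    return scale * (b - t) ** 2             # right half, by symmetry


def lowest_bid_cdf(t, n=20, bid_prob=0.3):
    """Special case F_V(t) = 1 - g_N[P(Y > t)] with g_N(s) = (q + p s)^n."""
    s = tri_survival(t)
    return 1.0 - ((1.0 - bid_prob) + bid_prob * s) ** n


for t in (170, 180, 190, 200, 210):
    print(t, round(tri_survival(t), 4), round(lowest_bid_cdf(t), 4))
```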



Example 15.15: Example 15.14 (Lowest Bidder ) with a general counting variable 

Suppose the number of bids is 1, 2 or 3 with probabilities 0.3, 0.5, 0.2, respectively. 

Determine $P(V \le t)$ in each case.

SOLUTION

The minimum of the selected $Y$'s is no greater than $t$ if and only if there is at least one $Y$ less
than or equal to $t$. We determine in each case probabilities for the number of bids satisfying $Y \le t$.
For each $t$, we are interested in the probability of one or more occurrences of the event $Y \le t$. This
is essentially the problem in Example 15.7 (Number of successes for random number $N$ of trials)
from "Random Selection", with probability $p = P(Y \le t)$.

t = [170 180 190 200 210];
p = [23/25 41/50 17/25 1/2 8/25];   % Probabilities Y <= t are 1 - p
gN = [0 0.3 0.5 0.2];               % Zero for missing value
PV = zeros(1,length(t));
for i=1:length(t)
  gY = [p(i),1 - p(i)];
  [d,pd] = gendf(gN,gY);
  PV(i) = (d>0)*pd';                % Selects positions for d > 0 and
end                                 % adds corresponding probabilities
disp([t;PV]')
  170.0000    0.1451
  180.0000    0.3075
  190.0000    0.5019
  200.0000    0.7000
  210.0000    0.8462

Example 15.14 (Lowest Bidder ) may be worked in this manner by using gN = 
ibinom(20, 0.3, 0:20). The results, of course, are the same as in the previous solution. The 
fact that the probabilities in this example are lower for each t than in Example 15.14 (Lowest 
Bidder ) reflects the fact that there are probably fewer bids in each case. 

Example 15.16: Batch testing 

Electrical units from a production line are first inspected for operability. However, experience
indicates that a fraction $p$ of those passing the initial operability test are defective. All operable
units are subsequently tested in a batch under continuous operation (a "burn in" test). Statistical
data indicate the defective units have times to failure $Y_i$ iid, exponential $(\lambda)$, whereas good units
have very long life (infinite from the point of view of the test). A batch of $n$ units is tested. Let $V$
be the time of the first failure and $N$ be the number of defective units in the batch. If the test goes
$t$ units of time with no failure (i.e., $V > t$), what is the probability of no defective units?

SOLUTION 

Since no defective units implies no failures in any reasonable test time, we have

$$\{N = 0\} \subset \{V > t\} \quad \text{so that} \quad P(N = 0|V > t) = \frac{P(N = 0)}{P(V > t)} \qquad (15.46)$$

Since $N = 0$ does not yield a minimum value, we have $P(V > t) = g_N[P(Y > t)]$. Now under the
condition above, the number of defective units $N \sim$ binomial $(n, p)$, so that $g_N(s) = (q + ps)^n$. If
$n$ is large and $p$ is reasonably small, $N$ is approximately Poisson $(np)$ with $g_N(s) = e^{np(s-1)}$ and
$P(N = 0) = e^{-np}$. Now $P(Y > t) = e^{-\lambda t}$; for large $n$

$$P(N = 0|V > t) = \frac{e^{-np}}{e^{np[P(Y > t) - 1]}} = e^{-npP(Y > t)} = e^{-npe^{-\lambda t}} \qquad (15.47)$$

For $n = 5000$, $p = 0.001$, $\lambda = 2$, and $t = 1, 2, 3, 4, 5$, MATLAB calculations give

t = 1:5;
n = 5000;
p = 0.001;
lambda = 2;
P = exp(-n*p*exp(-lambda*t));
disp([t;P]')
    1.0000    0.5083
    2.0000    0.9125
    3.0000    0.9877
    4.0000    0.9983
    5.0000    0.9998



It appears that a test of three to five hours should give reliable results. In actually designing the 
test, one should probably make calculations with a number of different assumptions on the fraction 
of defective units and the life duration of defective units. These calculations are relatively easy to 
make with MATLAB. 
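One such variation is to compare the Poisson approximation in (15.47) with the binomial form obtained by using $g_N(s) = (q + ps)^n$ directly in (15.46), which gives $q^n/(q + pe^{-\lambda t})^n$. The Python sketch below makes that comparison for the parameter values above; the function names are hypothetical.

```python
import math


def p_no_defectives_poisson(t, n=5000, p=0.001, lam=2.0):
    """Poisson approximation from (15.47): exp(-n p exp(-lam t))."""
    return math.exp(-n * p * math.exp(-lam * t))


def p_no_defectives_binomial(t, n=5000, p=0.001, lam=2.0):
    """Binomial form: P(N = 0) / g_N[P(Y > t)] = q^n / (q + p exp(-lam t))^n."""
    q = 1.0 - p
    return (q / (q + p * math.exp(-lam * t))) ** n


for t in range(1, 6):
    print(t, round(p_no_defectives_poisson(t), 4),
          round(p_no_defectives_binomial(t), 4))
```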



15.2.3 Bernoulli trials with random execution times or costs

Consider a Bernoulli sequence with probability $p$ of success on any component trial. Let $N$ be the number
of the trial on which the first success occurs. Let $Y_i$ be the time (or cost) to execute the $i$th trial. Then the
total time (or cost) from the beginning to the completion of the first success is

$$T = \sum_{i=1}^{N} Y_i \quad \text{(composite "demand" with } N - 1 \sim \text{geometric } (p)\text{)} \qquad (15.48)$$

We suppose the $Y_i$ form an iid class, independent of $N$. Now $N - 1 \sim$ geometric $(p)$ implies
$g_N(s) = ps/(1 - qs)$, so that

$$M_T(s) = g_N[M_Y(s)] = \frac{pM_Y(s)}{1 - qM_Y(s)} \qquad (15.49)$$

There are two useful special cases:

1. $Y_i \sim$ exponential $(\lambda)$, so that $M_Y(s) = \frac{\lambda}{\lambda - s}$.

$$M_T(s) = \frac{p\lambda/(\lambda - s)}{1 - q\lambda/(\lambda - s)} = \frac{p\lambda}{p\lambda - s} \qquad (15.50)$$

which implies $T \sim$ exponential $(p\lambda)$.

2. $Y_i - 1 \sim$ geometric $(p_0)$, so that $g_Y(s) = \frac{p_0 s}{1 - q_0 s}$.

$$g_T(s) = \frac{pp_0 s/(1 - q_0 s)}{1 - qp_0 s/(1 - q_0 s)} = \frac{pp_0 s}{1 - (1 - pp_0)s} \qquad (15.51)$$

so that $T - 1 \sim$ geometric $(pp_0)$.



Example 15.17: Job interviews 

Suppose a prospective employer is interviewing candidates for a job from a pool in which twenty
percent are qualified. Interview times (in hours) $Y_i$ are presumed to form an iid class, each
exponential (3). Thus, the average interview time is 1/3 hour (twenty minutes). We take the probability
for success on any interview to be $p = 0.2$. What is the probability a satisfactory candidate will
be found in four hours or less? What is the probability the maximum interview time will be no
greater than 0.5, 0.75, 1, 1.25, 1.5 hours?
SOLUTION 



$T \sim$ exponential $(0.2 \cdot 3 = 0.6)$, so that $P(T \le 4) = 1 - e^{-0.6 \cdot 4} = 0.9093$.

$$P(W \le t) = g_N[P(Y \le t)] = \frac{0.2\left(1 - e^{-3t}\right)}{1 - 0.8\left(1 - e^{-3t}\right)} = \frac{1 - e^{-3t}}{1 + 4e^{-3t}} \qquad (15.52)$$

MATLAB computations give

t = 0.5:0.25:1.5;
PWt = (1 - exp(-3*t))./(1 + 4*exp(-3*t));
disp([t;PWt]')
    0.5000    0.4105
    0.7500    0.6293
    1.0000    0.7924
    1.2500    0.8925
    1.5000    0.9468



The average interview time is 1/3 hour; with probability 0.63 the maximum is 3/4 hour or less; 
with probability 0.79 the maximum is one hour or less; etc. 
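The closed form for the maximum interview time can be compared with the defining sum $\sum_{n \ge 1} pq^{n-1} P^n(Y \le t)$. The Python sketch below does so for the interview example; the function names and the truncation point `n_max` are assumptions made for the illustration.

```python
import math

p, lam = 0.2, 3.0
q = 1.0 - p

# T ~ exponential(p * lam): time until a satisfactory candidate is found
P_T4 = 1.0 - math.exp(-p * lam * 4)


def max_cdf_closed(t):
    """g_N[P(Y <= t)] with g_N(s) = p s / (1 - q s), simplified for p = 0.2."""
    e = math.exp(-lam * t)
    return (1.0 - e) / (1.0 + 4.0 * e)


def max_cdf_series(t, n_max=2000):
    """Direct sum of P(N = n) P(Y <= t)^n with P(N = n) = p q^(n-1), n >= 1."""
    x = 1.0 - math.exp(-lam * t)
    return sum(p * q ** (n - 1) * x ** n for n in range(1, n_max + 1))


print(round(P_T4, 4))
for t in (0.5, 0.75, 1.0, 1.25, 1.5):
    print(t, round(max_cdf_closed(t), 4), round(max_cdf_series(t), 4))
```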

In the general case, solving for the distribution of T requires transform theory, and may be handled best by 
a program such as Maple or Mathematica. 

For the case of simple $Y_i$, we may use approximation procedures based on properties of the geometric
series. Since $N - 1 \sim$ geometric $(p)$,

$$g_N(s) = \frac{ps}{1 - qs} = ps\sum_{k=0}^{\infty} (qs)^k = ps\left[\sum_{k=0}^{n} (qs)^k + \sum_{k=n+1}^{\infty} (qs)^k\right] = ps\left[\sum_{k=0}^{n} (qs)^k + (qs)^{n+1}\sum_{k=0}^{\infty} (qs)^k\right] \qquad (15.53)$$

$$= ps\left[\sum_{k=0}^{n} (qs)^k\right] + (qs)^{n+1} g_N(s) = g_n(s) + (qs)^{n+1} g_N(s) \qquad (15.54)$$

Note that $g_n(s)$ has the form of the generating function for a simple approximation $N_n$ which matches
values and probabilities with $N$ up to $k = n$. Now, replacing $s$ by $g_Y(s)$,

$$g_T(s) = g_n[g_Y(s)] + (qg_Y(s))^{n+1} g_N[g_Y(s)] \qquad (15.55)$$

The evaluation involves convolution of coefficients which effectively sets $s = 1$. Since $g_N(1) = g_Y(1) = 1$,

$$(qg_Y(s))^{n+1} g_N[g_Y(s)] \text{ for } s = 1 \text{ reduces to } q^{n+1} = P(N > n) \qquad (15.56)$$

which is negligible if $n$ is large enough. Suitable $n$ may be determined in each case. With such an $n$, if the
$Y_i$ are nonnegative, integer-valued, we may use the gend procedure on $g_n[g_Y(s)]$, where

$$g_n(s) = ps + pqs^2 + pq^2s^3 + \cdots + pq^ns^{n+1} \qquad (15.57)$$

For the integer-valued case, as in the general case of simple $Y_i$, we could use mgd. However, gend is usually
faster and more efficient for the integer-valued case. Unless $q$ is small, the number of terms needed to
approximate $g_n$ is likely to be too great.

Example 15.18: Approximating the generating function 

Let $p = 0.3$ and $Y$ be uniformly distributed on $\{1, 2, \cdots, 10\}$. Determine the distribution for

$$T = \sum_{k=1}^{N} Y_k \qquad (15.58)$$



SOLUTION 



p = 0.3;
q = 1 - p;
a = [30 35 40];                 % Check for suitable n
b = q.^a
b = 1.0e-04 *                   % Use n = 40
    0.2254    0.0379    0.0064
n = 40;
k = 1:n;
gY = 0.1*[0 ones(1,10)];
gN = p*[0 q.^(k-1)];            % Probabilities, 0 <= k <= 40
gend
Do not forget zero coefficients for missing powers
Enter gen fn COEFFICIENTS for gN  gN
Enter gen fn COEFFICIENTS for gY  gY
Values are in row matrix D; probabilities are in PD.
To view the distribution, call for gD.
sum(PD)                         % Check sum of probabilities
ans = 1.0000
FD = cumsum(PD);                % Distribution function for D
plot(0:100,FD(1:101))           % See Figure 15.1
P50 = (D<=50)*PD'
P50 = 0.9497
P30 = (D<=30)*PD'
P30 = 0.8263

Figure 15.1: Execution Time Distribution Function $F_D$
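The truncation used in Example 15.18 can be reproduced with plain coefficient lists. The Python sketch below builds the truncated coefficients of $g_n$ and of $g_Y$ and composes them by repeated convolution. It is not the gend procedure; the function names are hypothetical, and the printed values of `P50` and `P30` are intended for comparison with the transcript rather than asserted here.

```python
def convolve(a, b):
    """Coefficients of the product of two polynomials given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def compound_demand(gN, gY):
    """Coefficients of g_N(g_Y(s)); entry k is the probability of the value k."""
    size = (len(gN) - 1) * (len(gY) - 1) + 1
    pD = [0.0] * size
    power = [1.0]
    for pn in gN:
        for k, c in enumerate(power):
            pD[k] += pn * c
        power = convolve(power, gY)
    return pD


p, n = 0.3, 40
q = 1 - p
gN = [0.0] + [p * q ** (k - 1) for k in range(1, n + 1)]   # P(N = k), 1 <= k <= 40
gY = [0.0] + [0.1] * 10                                     # uniform on 1, ..., 10
pD = compound_demand(gN, gY)
P50 = sum(pD[:51])                                          # P(D <= 50)
P30 = sum(pD[:31])                                          # P(D <= 30)
print(round(sum(pD), 6), round(P50, 4), round(P30, 4))
```

The total probability falls short of one by $q^{40}$, the mass discarded by the truncation.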



The same results may be achieved with mgd, although at the cost of more computing time. In that case, 
use gN as in Example 15.18 ( Approximating the generating function), but use the actual distribution for 
Y. 

15.2.4 Arrival times and counting processes

Suppose we have phenomena which take place at discrete instants of time, separated by random waiting 
or interarrival times. These may be arrivals of customers in a store, of noise pulses on a communications 
line, vehicles passing a position on a road, the failures of a system, etc. We refer to these occurrences as 
arrivals and designate the times of occurrence as arrival times. A stream of arrivals may be described in 
three equivalent ways. 



• Arrival times: $\{S_n : 0 \le n\}$, with $0 = S_0 < S_1 < \cdots$ a.s. (basic sequence)

• Interarrival times: $\{W_i : 1 \le i\}$, with each $W_i > 0$ a.s. (incremental sequence)

The strict inequalities imply that with probability one there are no simultaneous arrivals. The relations
between the two sequences are simply

$$S_0 = 0, \quad S_n = \sum_{i=1}^{n} W_i \quad \text{and} \quad W_n = S_n - S_{n-1} \quad \text{for all } n \ge 1 \qquad (15.59)$$



The formulation indicates the essential equivalence of the problem with that of the compound demand 
(Section 15.1.3: Calculations for the compound demand). The notation and terminology are changed to 
correspond to that customarily used in the treatment of arrival and counting processes. 
The stream of arrivals may be described in a third way.

• Counting processes: $N_t = N(t)$ is the number of arrivals in time period $(0, t]$. It should be clear that
this is a random quantity for each nonnegative $t$. For a given $t$, $\omega$ the value is $N(t, \omega)$. Such a family of
random variables constitutes a random process. In this case the random process is a counting process.

We thus have three equivalent descriptions for the stream of arrivals.

$$\{S_n : 0 \le n\} \qquad \{W_n : 1 \le n\} \qquad \{N_t : 0 \le t\} \qquad (15.60)$$

Several properties of the counting process $N$ should be noted:

1. $N(t + h) - N(t)$ counts the arrivals in the interval $(t, t + h]$, $h > 0$, so that $N(t + h) \ge N(t)$ for
$h > 0$.

2. $N_0 = 0$ and for $t > 0$ we have

$$N_t = \sum_{i=1}^{\infty} I_{(0,t]}(S_i) = \max\{n : S_n \le t\} = \min\{n : S_{n+1} > t\} \qquad (15.61)$$

3. For any given $\omega$, $N(\cdot, \omega)$ is a nondecreasing, right-continuous, integer-valued function defined on $[0, \infty)$,
with $N(0, \omega) = 0$.

The essential relationships between the three ways of describing the stream of arrivals are displayed in

$$W_n = S_n - S_{n-1}, \quad \{N_t \ge n\} = \{S_n \le t\}, \quad \{N_t = n\} = \{S_n \le t < S_{n+1}\} \qquad (15.62)$$

This implies

$$P(N_t = n) = P(S_n \le t) - P(S_{n+1} \le t) = P(S_{n+1} > t) - P(S_n > t) \qquad (15.63)$$

Although there are many possibilities for the interarrival time distributions, we assume

$$\{W_i : 1 \le i\} \text{ is iid, with } W_i > 0 \text{ a.s.} \qquad (15.64)$$

Under such assumptions, the counting process is often referred to as a renewal process and the interarrival
times are called renewal times. In the literature on renewal processes, it is common for the random variable
to count an arrival at $t = 0$. This requires an adjustment of the expressions relating $N_t$ and the $S_i$. We use
the convention above.

Exponential iid interarrival times 

The case of exponential interarrival times is natural in many applications and leads to important
mathematical results. We utilize the following propositions about the arrival times $S_n$, the interarrival times $W_i$,
and the counting process $N$.

a. If $\{W_i : 1 \le i\}$ is iid exponential $(\lambda)$, then $S_n \sim$ gamma $(n, \lambda)$ for all $n \ge 1$. This is worked out
in the unit on TRANSFORM METHODS, in the discussion of the connection between the gamma
distribution and the exponential distribution.

b. $S_n \sim$ gamma $(n, \lambda)$ for all $n \ge 1$, and $S_0 = 0$, iff $N_t \sim$ Poisson $(\lambda t)$ for all $t > 0$. This follows the
result in the unit DISTRIBUTION APPROXIMATIONS on the relationship between the Poisson
and gamma distributions, along with the fact that $\{N_t \ge n\} = \{S_n \le t\}$.
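The relationship $\{N_t \ge n\} = \{S_n \le t\}$ in proposition (b) can be illustrated numerically: integrating the gamma $(n, \lambda)$ density over $[0, t]$ should reproduce the Poisson $(\lambda t)$ probability of at least $n$ arrivals. The Python sketch below uses composite Simpson integration; the parameter values and function names are assumptions made for the illustration.

```python
import math


def gamma_pdf(n, lam, t):
    """Density of S_n ~ gamma(n, lam) for integer n >= 1."""
    return lam * math.exp(-lam * t) * (lam * t) ** (n - 1) / math.factorial(n - 1)


def gamma_cdf_simpson(n, lam, t, steps=2000):
    """P(S_n <= t) by composite Simpson integration on [0, t]; steps must be even."""
    h = t / steps
    total = gamma_pdf(n, lam, 0.0) + gamma_pdf(n, lam, t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * gamma_pdf(n, lam, i * h)
    return total * h / 3


def poisson_tail(mu, n):
    """P(N >= n) for N ~ Poisson(mu)."""
    return 1.0 - sum(math.exp(-mu) * mu ** k / math.factorial(k) for k in range(n))


n, lam, t = 5, 2.0, 3.0                # illustrative values, not from the text
print(gamma_cdf_simpson(n, lam, t), poisson_tail(lam * t, n))
```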

Remark. The counting process is a Poisson process in the sense that $N_t \sim$ Poisson $(\lambda t)$ for all $t > 0$. More
advanced treatments show that the process has independent, stationary increments. That is

1. $N(t + h) - N(t)$ has the same distribution as $N(h)$ for all $t, h > 0$, and

2. For $t_1 < t_2 < t_3 < t_4 < \cdots < t_{m-1} < t_m$, the class $\{N(t_2) - N(t_1), N(t_4) - N(t_3), \cdots, N(t_m) -
N(t_{m-1})\}$ is independent.

In words, the number of arrivals in any time interval depends upon the length of the interval and not its
location in time, and the numbers of arrivals in nonoverlapping time intervals are independent.

Example 15.19: Emergency calls 

Emergency calls arrive at a police switchboard with interarrival times (in hours) exponential (15). 
Thus, the average interarrival time is 1/15 hour (four minutes). What is the probability the number 
of calls in an eight hour shift is no more than 100, 120, 140? 

p = 1 - cpoisson(8*15, [101 121 141]) 
p = 0.0347 0.5243 0.9669 

We develop next a simple computational result for arrival processes for which $S_n \sim$ gamma $(n, \lambda)$.

Example 15.20: Gamma arrival times 

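Suppose the arrival times \(S_n \sim\) gamma \((n, \lambda)\) and \(g\) is such that

\[ \int_0^\infty |g| < \infty \quad \text{and} \quad E\left[\sum_{n=1}^\infty |g(S_n)|\right] < \infty \tag{15.65} \]

Then

\[ E\left[\sum_{n=1}^\infty g(S_n)\right] = \lambda \int_0^\infty g \tag{15.66} \]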



VERIFICATION 

We use the countable sums property (E8b) (list, p. 600) for expectation and the corresponding
property for integrals to assert

\[ E\left[\sum_{n=1}^\infty g(S_n)\right] = \sum_{n=1}^\infty E[g(S_n)] = \sum_{n=1}^\infty \int_0^\infty g(t) f_n(t)\,dt \quad \text{where} \quad f_n(t) = \frac{\lambda e^{-\lambda t} (\lambda t)^{n-1}}{(n-1)!} \tag{15.67} \]

We may apply (E8b) (list, p. 600) to assert

\[ \sum_{n=1}^\infty \int_0^\infty g f_n = \int_0^\infty g \sum_{n=1}^\infty f_n \tag{15.68} \]

Since

\[ \sum_{n=1}^\infty f_n(t) = \lambda e^{-\lambda t} \sum_{n=1}^\infty \frac{(\lambda t)^{n-1}}{(n-1)!} = \lambda e^{-\lambda t} e^{\lambda t} = \lambda \tag{15.69} \]

the proposition is established.

Example 15.21: Discounted replacement costs

A critical unit in a production system has life duration exponential \((\lambda)\). Upon failure the unit is
replaced immediately by a similar unit. Units fail independently. Cost of replacement of a unit is
\(c\) dollars. If money is discounted at a rate \(\alpha\), then a dollar spent \(t\) units of time in the future has a
current value \(e^{-\alpha t}\). If \(S_n\) is the time of replacement of the \(n\)th unit, then \(S_n \sim\) gamma \((n, \lambda)\) and
the present value of all future replacements is

\[ C = \sum_{n=1}^\infty c e^{-\alpha S_n} \tag{15.70} \]
The expected replacement cost is

\[ E[C] = E\left[\sum_{n=1}^\infty g(S_n)\right] \quad \text{where} \quad g(t) = c e^{-\alpha t} \tag{15.71} \]

Hence

\[ E[C] = \lambda \int_0^\infty c e^{-\alpha t}\,dt = \frac{\lambda c}{\alpha} \tag{15.72} \]

Suppose unit replacement cost \(c = 1200\), average time (in years) to failure \(1/\lambda = 1/4\), and the
discount rate per year \(\alpha = 0.08\) (eight percent). Then

\[ E[C] = \frac{1200 \cdot 4}{0.08} = 60{,}000 \tag{15.73} \]
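The value 60,000 can also be recovered term by term. Since \(S_n \sim\) gamma \((n, \lambda)\), the moment generating function gives \(E[e^{-\alpha S_n}] = (\lambda/(\lambda + \alpha))^n\), so the expected discounted costs form a geometric series. The sketch below, in Python, sums a truncated series and compares it with \(\lambda c/\alpha\). The truncation at 5000 terms is an arbitrary choice made here.

```python
lam, c, alpha = 4.0, 1200.0, 0.08

# E[exp(-alpha * S_n)] = (lam / (lam + alpha))**n for S_n ~ gamma(n, lam)
r = lam / (lam + alpha)
series = sum(c * r**n for n in range(1, 5001))  # truncated sum of expected discounted costs
closed = lam * c / alpha                        # closed form from (15.72)

print(series, closed)
```

The truncated sum and the closed form should agree to many decimal places, since the neglected tail is of order \(r^{5000}\).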



Example 15.22: Random costs 

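Suppose the cost of the \(n\)th replacement in Example 15.21 (Discounted replacement costs) is a
random quantity \(C_n\), with \(\{C_n, S_n\}\) independent and \(E[C_n] = c\), invariant with \(n\). Then

\[ E[C] = E\left[\sum_{n=1}^\infty C_n e^{-\alpha S_n}\right] = \sum_{n=1}^\infty E[C_n] E[e^{-\alpha S_n}] = \sum_{n=1}^\infty c E[e^{-\alpha S_n}] = \frac{\lambda c}{\alpha} \tag{15.74} \]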



The analysis to this point assumes the process will continue endlessly into the future. Often, it is desirable 
to plan for a specific, finite period. The result of Example 15.20 ( Gamma arrival times) may be modified 
easily to account for a finite period, often referred to as a finite horizon. 

Example 15.23: Finite horizon 

Under the conditions assumed in Example 15.20 (Gamma arrival times), above, let \(N_t\) be the
counting random variable for arrivals in the interval \((0, t]\).

\[ \text{If} \quad Z_t = \sum_{n=1}^{N_t} g(S_n), \quad \text{then} \quad E[Z_t] = \lambda \int_0^t g(u)\,du \tag{15.75} \]

VERIFICATION

Since \(N_t \ge n\) iff \(S_n \le t\), \(\sum_{n=1}^{N_t} g(S_n) = \sum_{n=1}^\infty I_{(0,t]}(S_n) g(S_n)\). In the result of Example 15.20
(Gamma arrival times), replace \(g\) by \(I_{(0,t]} g\) and note that

\[ \int_0^\infty I_{(0,t]}(u) g(u)\,du = \int_0^t g(u)\,du \tag{15.76} \]



Example 15.24: Replacement costs, finite horizon 

Under the conditions of Example 15.21 (Discounted replacement costs), consider the replacement
costs over a two-year period.

SOLUTION

\[ E[C] = \lambda c \int_0^t e^{-\alpha u}\,du = \frac{\lambda c}{\alpha} \left(1 - e^{-\alpha t}\right) \tag{15.77} \]

Thus, the expected cost for the infinite horizon \(\lambda c/\alpha\) is reduced by the factor \(1 - e^{-\alpha t}\). For
\(t = 2\) and the numbers in Example 15.21 (Discounted replacement costs), the reduction factor is
\(1 - e^{-0.16} = 0.1479\) to give \(E[C] = 60000 \cdot 0.1479 = 8{,}871.37\).
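The finite-horizon figure can be reproduced in two ways: from the closed form \((\lambda c/\alpha)(1 - e^{-\alpha t})\), and by numerically integrating \(\lambda c e^{-\alpha u}\) over \((0, t]\). The following Python sketch does both. The midpoint rule and the step count are choices made here for illustration only.

```python
import math

lam, c, alpha, t = 4.0, 1200.0, 0.08, 2.0

# Closed form: infinite-horizon cost times the reduction factor
closed = (lam * c / alpha) * (1.0 - math.exp(-alpha * t))

# Midpoint-rule approximation of lam * c * integral_0^t exp(-alpha*u) du
steps = 100000
h = t / steps
numeric = lam * c * h * sum(math.exp(-alpha * (i + 0.5) * h) for i in range(steps))

print(round(closed, 2), round(numeric, 2))
```

Both figures should match the expected cost of about 8,871.37 obtained above.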
In the important special case that \(g(u) = c e^{-\alpha u}\), the expression for \(E\left[\sum_{n=1}^\infty g(S_n)\right]\) may be put into a form
which does not require the interarrival times to be exponential.

Example 15.25: General interarrival, exponential g

Suppose \(S_0 = 0\) and \(S_n = \sum_{i=1}^n W_i\), where \(\{W_i : 1 \le i\}\) is iid. Let \(\{V_n : 1 \le n\}\) be a class such
that each \(E[V_n] = c\) and each pair \(\{V_n, S_n\}\) is independent. Then for \(\alpha > 0\)

\[ E[C] = E\left[\sum_{n=1}^\infty V_n e^{-\alpha S_n}\right] = c \cdot \frac{M_W(-\alpha)}{1 - M_W(-\alpha)} \tag{15.78} \]

where \(M_W\) is the moment generating function for \(W\).

DERIVATION

First we note that

\[ E[V_n e^{-\alpha S_n}] = c M_{S_n}(-\alpha) = c M_W^n(-\alpha) \tag{15.79} \]

Hence, by properties of expectation and the geometric series

\[ E[C] = c \sum_{n=1}^\infty M_W^n(-\alpha) = \frac{c M_W(-\alpha)}{1 - M_W(-\alpha)}, \quad \text{provided } |M_W(-\alpha)| < 1 \tag{15.80} \]

Since \(\alpha > 0\) and \(W > 0\), we have \(0 < e^{-\alpha W} < 1\), so that \(M_W(-\alpha) = E[e^{-\alpha W}] < 1\).
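As a consistency check, take the interarrival times to be exponential \((\lambda)\), for which \(M_W(-\alpha) = \lambda/(\lambda + \alpha)\). The general formula then reduces to the result of Example 15.22 (Random costs):

\[ E[C] = c \cdot \frac{\lambda/(\lambda + \alpha)}{1 - \lambda/(\lambda + \alpha)} = c \cdot \frac{\lambda/(\lambda + \alpha)}{\alpha/(\lambda + \alpha)} = \frac{\lambda c}{\alpha} \]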

Example 15.26: Uniformly distributed interarrival times 

Suppose each \(W_i \sim\) uniform \((a, b)\). Then (see Appendix C (Section 17.3)),

\[ M_W(-\alpha) = \frac{e^{-a\alpha} - e^{-b\alpha}}{\alpha (b - a)} \quad \text{so that} \quad E[C] = c \cdot \frac{e^{-a\alpha} - e^{-b\alpha}}{\alpha (b - a) - [e^{-a\alpha} - e^{-b\alpha}]} \tag{15.81} \]

Let \(a = 1\), \(b = 5\), \(c = 100\) and \(\alpha = 0.08\). Then,

a = 1;
b = 5;
c = 100;
A = 0.08;
MW = (exp(-a*A) - exp(-b*A))/(A*(b - a))
MW = 0.7900
EC = c*MW/(1 - MW)
EC = 376.1643



15.3 Problems on Random Selection 3 

Exercise 15.1 (Solution on p. 482.) 

(See Exercise 3 (Exercise 8.3) from "Problems on Random Variables and Joint Distributions") A 
die is rolled. Let X be the number of spots that turn up. A coin is flipped X times. Let Y be the 
number of heads that turn up. Determine the distribution for Y. 

Exercise 15.2 (Solution on p. 482.) 

(See Exercise 4 (Exercise 8.4) from "Problems on Random Variables and Joint Distributions") As 
a variation of Exercise 15.1, suppose a pair of dice is rolled instead of a single die. Determine the 
distribution for Y. 



3 This content is available online at <http://cnx.org/content/m24531/1.4/>.
Exercise 15.3 (Solution on p. 482.)

(See Exercise 5 (Exercise 8.5) from "Problems on Random Variables and Joint Distributions")
Suppose a pair of dice is rolled. Let X be the total number of spots which turn up. Roll the pair
an additional X times. Let Y be the number of sevens that are thrown on the X rolls. Determine
the distribution for Y. What is the probability of three or more sevens?

Exercise 15.4 (Solution on p. 483.)

(See Example 7 (Example 14.7: A random number N of Bernoulli trials) from "Conditional
Expectation, Regression") A number X is chosen by a random selection from the integers 1 through
20 (say by drawing a card from a box). A pair of dice is thrown X times. Let Y be the number of
"matches" (i.e., both ones, both twos, etc.). Determine the distribution for Y.

Exercise 15.5 (Solution on p. 484.)

(See Exercise 20 (Exercise 14.20) from "Problems on Conditional Expectation, Regression") A
number X is selected randomly from the integers 1 through 100. A pair of dice is thrown X times.
Let Y be the number of sevens thrown on the X tosses. Determine the distribution for Y. Determine
E[Y] and \(P(Y \le 20)\).

Exercise 15.6 (Solution on p. 484.)

(See Exercise 21 (Exercise 14.21) from "Problems on Conditional Expectation, Regression") A
number X is selected randomly from the integers 1 through 100. Each of two people draw X
times independently and randomly a number from 1 to 10. Let Y be the number of matches (i.e.,
both draw ones, both draw twos, etc.). Determine the distribution for Y. Determine E[Y] and
\(P(Y \le 10)\).

Exercise 15.7 (Solution on p. 484.)

Suppose the number of entries in a contest is N ~ binomial (20, 0.4). There are four questions.
Let \(Y_i\) be the number of questions answered correctly by the ith contestant. Suppose the \(Y_i\) are
iid, with common distribution

Y = [1 2 3 4]    PY = [0.2 0.4 0.3 0.1]    (15.82)

Let D be the total number of correct answers. Determine E[D], Var[D], \(P(15 \le D \le 25)\), and
\(P(10 \le D \le 30)\).

Exercise 15.8 (Solution on p. 485.) 

Game wardens are making an aerial survey of the number of deer in a park. The number of herds
to be sighted is assumed to be a random variable N ~ binomial (20, 0.5). Each herd is assumed to
be from 1 to 10 in size, with probabilities

Value         1     2     3     4     5     6     7     8     9     10
Probability   0.05  0.10  0.15  0.20  0.15  0.10  0.10  0.05  0.05  0.05

Table 15.1

Let D be the number of deer sighted under this model. Determine \(P(D \le t)\) for t = 25, 50, 75, 100
and P(D > 90).

Exercise 15.9 (Solution on p. 485.) 

A supply house stocks seven popular items. The table below shows the values of the items and the
probability of each being selected by a customer.

Value         12.50  25.00  30.50  40.00  42.50  50.00  60.00
Probability   0.10   0.15   0.20   0.20   0.15   0.10   0.10

Table 15.2

Suppose the purchases of customers are iid, and the number of customers in a day is binomial
(10, 0.5). Determine the distribution for the total demand D.

a. How many different possible values are there? What is the maximum possible total sales?

b. Determine E[D] and \(P(D \le t)\) for t = 100, 150, 200, 250, 300.
Determine \(P(100 < D \le 200)\).

Exercise 15.10 (Solution on p. 486.) 

A game is played as follows: 

1. A wheel is spun, giving one of the integers 0 through 9 on an equally likely basis.

2. A single die is thrown the number of times indicated by the result of the spin of the wheel.
The number of points made is the total of the numbers turned up on the sequence of throws
of the die.

3. A player pays sixteen dollars to play; a dollar is returned for each point made.

Let Y represent the number of points made and X = Y - 16 be the net gain (possibly negative) of
the player. Determine the maximum value of

X, E[X], Var[X], \(P(X > 0)\), \(P(X \ge 10)\), \(P(X \ge 16)\).

Exercise 15.11 (Solution on p. 486.)

Marvin calls on four customers. With probability \(p_1 = 0.6\) he makes a sale in each case. Geraldine
calls on five customers, with probability \(p_2 = 0.5\) of a sale in each case. Customers who buy do so
on an iid basis, and order an amount \(Y_i\) (in dollars) with common distribution:

Y = [200 220 240 260 280 300]    PY = [0.10 0.15 0.25 0.25 0.15 0.10]    (15.83)

Let \(D_1\) be the total sales for Marvin and \(D_2\) the total sales for Geraldine. Let \(D = D_1 + D_2\).
Determine the distribution and mean and variance for \(D_1\), \(D_2\), and \(D\). Determine \(P(D_1 > D_2)\) and
\(P(D \ge 1500)\), \(P(D \ge 1000)\), and \(P(D \ge 750)\).

Exercise 15.12 (Solution on p. 487.) 

A questionnaire is sent to twenty persons. The number who reply is a random number N ~ 
binomial (20, 0.7). If each respondent has probability p = 0.8 of favoring a certain proposition, 
what is the probability of ten or more favorable replies? Of fifteen or more? 

Exercise 15.13 (Solution on p. 487.) 

A random number N of students take a qualifying exam. A grade of 70 or more earns a pass.
Suppose N ~ binomial (20, 0.3). If each student has probability p = 0.7 of making 70 or more, 
what is the probability all will pass? Ten or more will pass? 

Exercise 15.14 (Solution on p. 487.) 

Five hundred questionnaires are sent out. The probability of a reply is 0.6. The probability that 

a reply will be favorable is 0.75. What is the probability of at least 200, 225, 250 favorable replies? 

Exercise 15.15 (Solution on p. 488.) 

Suppose the number of Japanese visitors to Florida in a week is \(N_1 \sim\) Poisson (500) and the
number of German visitors is \(N_2 \sim\) Poisson (300). If 25 percent of the Japanese and 20 percent of
the Germans visit Disney World, what is the distribution for the total number D of German and
Japanese visitors to the park? Determine \(P(D \ge k)\) for \(k = 150, 155, \cdots, 245, 250\).

Exercise 15.16 (Solution on p. 488.) 

A junction point in a network has two incoming lines and two outgoing lines. The number of
incoming messages \(N_1\) on line one in one hour is Poisson (50); on line 2 the number is \(N_2 \sim\) Poisson
(45). On incoming line 1 the messages have probability \(p_{1a} = 0.33\) of leaving on outgoing line a
and \(1 - p_{1a}\) of leaving on line b. The messages coming in on line 2 have probability \(p_{2a} = 0.47\) of
leaving on line a. Under the usual independence assumptions, what is the distribution of outgoing
messages on line a? What are the probabilities of at least 30, 35, 40 outgoing messages on line a?

Exercise 15.17 (Solution on p. 488.) 

A computer store sells Macintosh, HP, and various other IBM compatible personal computers. It
has two major sources of customers:

1. Students and faculty from a nearby university

2. General customers for home and business computing.

Suppose the following assumptions are reasonable for monthly purchases.

• The number of university buyers \(N_1 \sim\) Poisson (30). The probabilities for Mac, HP, others
are 0.4, 0.2, 0.4, respectively.

• The number of non-university buyers \(N_2 \sim\) Poisson (65). The respective probabilities for
Mac, HP, others are 0.2, 0.3, 0.5.

• For each group, the composite demand assumptions are reasonable, and the two groups buy
independently.

What is the distribution for the number of Mac sales? What is the distribution for the total number
of Mac and HP sales?

Exercise 15.18 (Solution on p. 488.) 

The number N of "hits" in a day on a Web site on the internet is Poisson (80). Suppose the
probability is 0.10 that any hit results in a sale, is 0.30 that the result is a request for information, 
and is 0.60 that the inquirer just browses but does not identify an interest. What is the probability 
of 10 or more sales? What is the probability that the number of sales is at least half the number of 
information requests (use suitable simple approximations)? 

Exercise 15.19 (Solution on p. 489.) 

The number N of orders sent to the shipping department of a mail order house is Poisson (700).
Orders require one of seven kinds of boxes, which with packing costs have distribution

Cost (dollars)   0.75  1.25  2.00  2.50  3.00  3.50  4.00
Probability      0.10  0.15  0.15  0.25  0.20  0.10  0.05

Table 15.3



What is the probability the total cost of the $2.50 boxes is no greater than $475? What is the 
probability the cost of the $2.50 boxes is greater than the cost of the $3.00 boxes? What is the 
probability the cost of the $2.50 boxes is not more than $50.00 greater than the cost of the $3.00 
boxes? Suggestion. Truncate the Poisson distributions at about twice the mean value. 

Exercise 15.20 (Solution on p. 489.) 

One car in 5 in a certain community is a Volvo. If the number of cars passing a traffic check point 
in an hour is Poisson (130), what is the expected number of Volvos? What is the probability of at 
least 30 Volvos? What is the probability the number of Volvos is between 16 and 40 (inclusive)? 

Exercise 15.21 (Solution on p. 489.) 

A service center on an interstate highway experiences customers in a one-hour period as follows:

• Northbound: Total vehicles: Poisson (200). Twenty percent are trucks.

• Southbound: Total vehicles: Poisson (180). Twenty five percent are trucks.

- Each truck has one or two persons, with respective probabilities 0.7 and 0.3.

- Each car has 1, 2, 3, 4, or 5 persons, with probabilities 0.3, 0.3, 0.2, 0.1, 0.1, respectively.

Under the usual independence assumptions, let D be the number of persons to be served. Determine
E[D], Var[D], and the generating function \(g_D(s)\).

Exercise 15.22 (Solution on p. 490.) 

The number N of customers in a shop in a given day is Poisson (120). Customers pay with cash or
by MasterCard or Visa charge cards, with respective probabilities 0.25, 0.40, 0.35. Make the usual
independence assumptions. Let \(N_1, N_2, N_3\) be the numbers of cash sales, MasterCard charges, Visa
card charges, respectively. Determine \(P(N_1 > 30)\), \(P(N_2 > 60)\), \(P(N_3 > 50)\), and \(P(N_2 > N_3)\).

Exercise 15.23 (Solution on p. 490.) 

A discount retail store has two outlets in Houston, with a common warehouse. Customer requests
are phoned to the warehouse for pickup. Two items, a and b, are featured in a special sale. The
number of orders in a day from store A is \(N_A \sim\) Poisson (30); from store B, the number of orders
is \(N_B \sim\) Poisson (40).

For store A, the probability an order is for a is 0.3, and for b is 0.7.

For store B, the probability an order is for a is 0.4, and for b is 0.6. What is the probability the
total order for item b in a day is 50 or more?

Exercise 15.24 (Solution on p. 490.) 

The number of bids on a job is a random variable N ~ binomial (7, 0.6). Bids (in thousands of 
dollars) are iid with Y uniform on [3, 5]. What is the probability of at least one bid of $3,500 or 
less? Note that "no bid" is not a bid of 0. 

Exercise 15.25 (Solution on p. 491.) 

The number of customers during the noon hour at a bank teller's station is a random number N
with distribution

N = 1 : 10,    PN = 0.01 * [5 7 10 11 12 13 12 11 10 9]    (15.84)

The amounts they want to withdraw can be represented by an iid class having the common
distribution Y ~ exponential (0.01). Determine the probabilities that the maximum withdrawal is
less than or equal to t for t = 100, 200, 300, 400, 500.

Exercise 15.26 (Solution on p. 491.) 

A job is put out for bids. Experience indicates the number N of bids is a random variable having
values 0 through 8, with respective probabilities

Value         0     1     2     3     4     5     6     7     8
Probability   0.05  0.10  0.15  0.20  0.20  0.10  0.10  0.07  0.03

Table 15.4

The market is such that bids (in thousands of dollars) are iid, uniform [100, 200]. Determine
the probability of at least one bid of $125,000 or less.

Exercise 15.27 (Solution on p. 491.) 

A property is offered for sale. Experience indicates the number N of bids is a random variable
having values 0 through 10, with respective probabilities

Value         0     1     2     3     4     5     6     7     8     9     10
Probability   0.05  0.15  0.15  0.20  0.10  0.10  0.05  0.05  0.05  0.05  0.05

Table 15.5

The market is such that bids (in thousands of dollars) are iid, uniform [150, 200]. Determine the
probability of at least one bid of $180,000 or more.

Exercise 15.28 (Solution on p. 491.) 

A property is offered for sale. Experience indicates the number N of bids is a random variable
having values 0 through 8, with respective probabilities

Number        0     1     2     3     4     5     6     7     8
Probability   0.05  0.15  0.15  0.20  0.15  0.10  0.10  0.05  0.05

Table 15.6

The market is such that bids (in thousands of dollars) are iid symmetric triangular on [150, 250].
Determine the probability of at least one bid of $210,000 or more.

Exercise 15.29 (Solution on p. 491.) 

Suppose N ~ binomial (10, 0.3) and the \(Y_i\) are iid, uniform on [10, 20]. Let V be the minimum
of the N values of the \(Y_i\). Determine P(V > t) for integer values from 10 to 20.

Exercise 15.30 (Solution on p. 492.) 

Suppose a teacher is equally likely to have 0, 1, 2, 3 or 4 students come in during office hours on a 
given day. If the lengths of the individual visits, in minutes, are iid exponential (0.1), what is the 
probability that no visit will last more than 20 minutes?

Exercise 15.31 (Solution on p. 492.) 

Twelve solid-state modules are installed in a control system. If the modules are not defective, 
they have practically unlimited life. However, with probability p = 0.05 any unit could have a 
defect which results in a lifetime (in hours) exponential (0.0025). Under the usual independence 
assumptions, what is the probability the unit does not fail because of a defective module in the first 
500 hours after installation? 

Exercise 15.32 (Solution on p. 492.) 

The number N of bids on a painting is binomial (10, 0.3). The bid amounts (in thousands of
dollars) \(Y_i\) form an iid class, with common density function \(f_Y(t) = 0.005 (37 - 2t)\), \(2 \le t \le 10\).
What is the probability that the maximum amount bid is greater than $5,000?

Exercise 15.33 (Solution on p. 493.) 

A computer store offers each customer who makes a purchase of $500 or more a free chance 
at a drawing for a prize. The probability of winning on a draw is 0.05. Suppose the times, in 
hours, between sales qualifying for a drawing is exponential (4). Under the usual independence 
assumptions, what is the expected time between a winning draw? What is the probability of three 
or more winners in a ten hour day? Of five or more? 

Exercise 15.34 (Solution on p. 493.) 

Noise pulses arrive on a data phone line according to an arrival process such that for each t > 0 the
number N t of arrivals in time interval (0, t], in hours, is Poisson (It). The ith pulse has an "intensity" 
Yj such that the class {Y : 1 < i} is iid, with the common distribution function Fy (u) = 1 — e~ 2u 
for u > 0. Determine the probability that in an eight-hour day the intensity will not exceed two. 

Exercise 15.35 (Solution on p. 493.) 

The number N of noise bursts on a data transmission line in a period \((0, t]\) is Poisson \((\mu t)\). The
number of digit errors caused by the ith burst is \(Y_i\), with the class \(\{Y_i : 1 \le i\}\) iid, \(Y_i - 1 \sim\)
geometric (p). An error correcting system is capable of correcting five or fewer errors in any burst.
Suppose \(\mu = 12\) and p = 0.35. What is the probability of no uncorrected error in two hours of
operation?

Solutions to Exercises in Chapter 15

Solution to Exercise 15.1 (p. 476) 

PX = [0 (1/6)*ones(1,6)];
PY = [0.5 0.5];
gend
Do not forget zero coefficients for missing powers
Enter gen fn COEFFICIENTS for gN  PX
Enter gen fn COEFFICIENTS for gY  PY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view the distribution, call for gD.
disp(gD)                  % Compare with P8-3
         0    0.1641
    1.0000    0.3125
    2.0000    0.2578
    3.0000    0.1667
    4.0000    0.0755
    5.0000    0.0208
    6.0000    0.0026
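The distribution can be cross-checked without gend by conditioning on the number of flips: \(P(Y = k) = \sum_n P(N = n) C(n, k) (1/2)^n\). The following is a minimal Python sketch. The variable names are illustrative and are not part of the m-programs.

```python
from math import comb

# N: number of coin flips (die face, uniform on 1..6); each flip is a head with probability 1/2
pN = {n: 1 / 6 for n in range(1, 7)}
p = 0.5

# P(Y = k) = sum over n of P(N = n) * C(n, k) * p^k * (1 - p)^(n - k)
pY = [sum(pn * comb(n, k) * p**k * (1 - p)**(n - k) for n, pn in pN.items() if n >= k)
      for k in range(7)]

for k, prob in enumerate(pY):
    print(k, round(prob, 4))
```

The printed probabilities should agree with the second column of gD above.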



Solution to Exercise 15.2 (p. 476) 

PN = (1/36)*[0 0 1 2 3 4 5 6 5 4 3 2 1];
PY = [0.5 0.5];
gend
Do not forget zero coefficients for missing powers
Enter gen fn COEFFICIENTS for gN  PN
Enter gen fn COEFFICIENTS for gY  PY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view the distribution, call for gD.
disp(gD)
         0    0.0269
    1.0000    0.1025
    2.0000    0.1823
    3.0000    0.2158
    4.0000    0.1954
    5.0000    0.1400
    6.0000    0.0806
    7.0000    0.0375
    8.0000    0.0140
    9.0000    0.0040
   10.0000    0.0008
   11.0000    0.0001
   12.0000    0.0000



Solution to Exercise 15.3 (p. 477) 



PX = (1/36)*[0 0 1 2 3 4 5 6 5 4 3 2 1];
PY = [5/6 1/6];
gend
Do not forget zero coefficients for missing powers
Enter gen fn COEFFICIENTS for gN  PX
Enter gen fn COEFFICIENTS for gY  PY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view the distribution, call for gD.
disp(gD)
         0    0.3072
    1.0000    0.3660
    2.0000    0.2152
    3.0000    0.0828
    4.0000    0.0230
    5.0000    0.0048
    6.0000    0.0008
    7.0000    0.0001
    8.0000    0.0000
    9.0000    0.0000
   10.0000    0.0000
   11.0000    0.0000
   12.0000    0.0000
P = (D>=3)*PD'
P = 0.1116



Solution to Exercise 15.4 (p. 477) 

gN = (1/20)*[0 ones(1,20)];
gY = [5/6 1/6];
gend
Do not forget zero coefficients for missing powers
Enter gen fn COEFFICIENTS for gN  gN
Enter gen fn COEFFICIENTS for gY  gY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view the distribution, call for gD.
disp(gD)
         0    0.2435
    1.0000    0.2661
    2.0000    0.2113
    3.0000    0.1419
    4.0000    0.0795
    5.0000    0.0370
    6.0000    0.0144
    7.0000    0.0047
    8.0000    0.0013
    9.0000    0.0003
   10.0000    0.0001
   11.0000    0.0000
   12.0000    0.0000
   13.0000    0.0000
   14.0000    0.0000
   15.0000    0.0000
   16.0000    0.0000
   17.0000    0.0000
   18.0000    0.0000
   19.0000    0.0000
   20.0000    0.0000



Solution to Exercise 15.5 (p. 477) 

gN = 0.01*[0 ones(1,100)];
gY = [5/6 1/6] ; 
gend 

Do not forget zero coefficients for missing powers 
Enter gen fn COEFFICIENTS for gN gN 
Enter gen fn COEFFICIENTS for gY gY 
Results are in N, PN, Y, PY, D, PD, P 
May use jcalc or jcalcf on N, D, P 
To view the distribution, call for gD. 
EY = dot(D,PD) 
EY = 8.4167 
P20 = (D<=20)*PD' 
P20 = 0.9837 

Solution to Exercise 15.6 (p. 477) 

gN = 0.01*[0 ones(1,100)];
gY = [0.9 0.1]; 
gend 

Do not forget zero coefficients for missing powers 
Enter gen fn COEFFICIENTS for gN gN 
Enter gen fn COEFFICIENTS for gY gY 
Results are in N, PN, Y, PY, D, PD, P 
May use jcalc or jcalcf on N, D, P 
To view the distribution, call for gD. 
EY = dot(D,PD) 
EY = 5.0500 
P10 = (D<=10)*PD' 
P10 = 0.9188 

Solution to Exercise 15.7 (p. 477) 



gN = ibinom(20,0.4,0:20);
gY = 0.1*[0 2 4 3 1];
gend
Do not forget zero coefficients for missing powers
Enter gen fn COEFFICIENTS for gN  gN
Enter gen fn COEFFICIENTS for gY  gY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view the distribution, call for gD.
ED = dot(D,PD)
ED = 18.4000
VD = (D.^2)*PD' - ED^2
VD = 31.8720
P1 = ((15<=D)&(D<=25))*PD'
P1 = 0.6386
P2 = ((10<=D)&(D<=30))*PD'
P2 = 0.9290



Solution to Exercise 15.8 (p. 477) 

gN = ibinom(20,0.5,0:20);
gY = 0.01*[0 5 10 15 20 15 10 10 5 5 5] ; 
gend 

Do not forget zero coefficients for missing powers 
Enter gen fn COEFFICIENTS for gN gN 
Enter gen fn COEFFICIENTS for gY gY 
Results are in N, PN, Y, PY, D, PD, P 
May use jcalc or jcalcf on N, D, P 
To view the distribution, call for gD. 
k = [25 50 75 100] ; 
P = zeros(1,4);
for i = 1:4
  P(i) = (D<=k(i))*PD';
end
disp(P) 

0.0310 0.5578 0.9725 0.9998 

Solution to Exercise 15.9 (p. 477) 

gN = ibinom(10,0.5,0:10);
Y = [12.5 25 30.5 40 42.5 50 60];
PY = 0.01*[10 15 20 20 15 10 10];
mgd
Enter gen fn COEFFICIENTS for gN  gN
Enter VALUES for Y  Y
Enter PROBABILITIES for Y  PY
Values are in row matrix D; probabilities are in PD.
To view the distribution, call for mD.
s = size(D)
s = 1 839
M = max(D)
M = 590
t = [100 150 200 250 300];
P = zeros(1,5);
for i = 1:5
  P(i) = (D<=t(i))*PD';
end
disp(P)
    0.1012    0.3184    0.6156    0.8497    0.9614
P1 = ((100<D)&(D<=200))*PD'
P1 = 0.5144
Solution to Exercise 15.10 (p. 478)

gn = 0.1*ones(1,10);
gy = (1/6)*[0 ones(1,6)];
[Y,PY] = gendf(gn,gy);
[X,PX] = csort(Y-16,PY);
M = max(X)
M = 38
EX = dot(X,PX)            % Check: EX = En*Ey - 16 = 4.5*3.5 - 16
EX = -0.2500              % 4.5*3.5 - 16 = -0.25
VX = dot(X.^2,PX) - EX^2
VX = 114.1875
Ppos = (X>0)*PX'
Ppos = 0.4667
P10 = (X>=10)*PX'
P10 = 0.2147
P16 = (X>=16)*PX'
P16 = 0.0803
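The mean and variance of the net gain can be checked against the standard random-sum formulas \(E[Y] = E[N]E[U]\) and \(\mathrm{Var}[Y] = E[N]\mathrm{Var}[U] + \mathrm{Var}[N]E^2[U]\), where N is the wheel result and U is a single die throw. The sketch below uses exact fractions in Python. The symbol U and the variable names are introduced here for illustration only.

```python
from fractions import Fraction as F

# Wheel: N uniform on 0..9; die: U uniform on 1..6
EN = F(sum(range(10)), 10)
VN = F(sum(n * n for n in range(10)), 10) - EN**2
EU = F(sum(range(1, 7)), 6)
VU = F(sum(u * u for u in range(1, 7)), 6) - EU**2

# Random-sum moments: E[Y] = E[N]E[U], Var[Y] = E[N]Var[U] + Var[N]E[U]^2
EY = EN * EU
VY = EN * VU + VN * EU**2

EX = EY - 16   # net gain X = Y - 16
VX = VY        # subtracting a constant leaves the variance unchanged
print(float(EX), float(VX))
```

These values should match EX and VX computed from the distribution above.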



Solution to Exercise 15.11 (p. 478) 



gnM = ibinom(4,0.6,0:4);
gnG = ibinom(5,0.5,0:5);
Y = 200:20:300;
PY = 0.01*[10 15 25 25 15 10];
[D1,PD1] = mgdf(gnM,Y,PY);
[D2,PD2] = mgdf(gnG,Y,PY);
ED1 = dot(D1,PD1)              % Check: ED1 = EnM*EY = 2.4*250
ED1 = 600.0000
VD1 = dot(D1.^2,PD1) - ED1^2
VD1 = 6.1968e+04
ED2 = dot(D2,PD2)              % Check: ED2 = EnG*EY = 2.5*250
ED2 = 625.0000
VD2 = dot(D2.^2,PD2) - ED2^2
VD2 = 8.0175e+04
[D1,D2,t,u,PD1,PD2,P] = icalcf(D1,D2,PD1,PD2);
Use array operations on matrices X, Y, PX, PY, t, u, and P
[D,PD] = csort(t+u,P);
ED = dot(D,PD)
ED = 1.2250e+03
eD = ED1 + ED2                 % Check: ED = ED1 + ED2
eD = 1.2250e+03
VD = dot(D.^2,PD) - ED^2
VD = 1.4214e+05
vD = VD1 + VD2                 % Check: VD = VD1 + VD2
vD = 1.4214e+05
P1g2 = total((t>u).*P)
P1g2 = 0.4612
k = [1500 1000 750];
PDk = zeros(1,3);
for i = 1:3
  PDk(i) = (D>=k(i))*PD';
end
disp(PDk)
    0.2556    0.7326    0.8872

Solution to Exercise 15.12 (p. 478) 



gN = ibinom(20,0.7,0:20);
gY = [0.2 0.8];
gend
Do not forget zero coefficients for missing powers
Enter gen fn COEFFICIENTS for gN  gN
Enter gen fn COEFFICIENTS for gY  gY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view the distribution, call for gD.
P10 = (D>=10)*PD'
P10 = 0.7788
P15 = (D>=15)*PD'
P15 = 0.0660
pD = ibinom(20,0.7*0.8,0:20);    % Alternate: use D binomial (pp0)
D = 0:20;
p10 = (D>=10)*pD'
p10 = 0.7788
p15 = (D>=15)*pD'
p15 = 0.0660

Solution to Exercise 15.13 (p. 478) 

gN = ibinom(20,0.3,0:20);
gY = [0.3 0.7];
gend
Do not forget zero coefficients for missing powers
Enter gen fn COEFFICIENTS for gN  gN
Enter gen fn COEFFICIENTS for gY  gY
Results are in N, PN, Y, PY, D, PD, P
May use jcalc or jcalcf on N, D, P
To view the distribution, call for gD.
Pall = (D==20)*PD'
Pall = 2.7822e-14
pall = (0.3*0.7)^20       % Alternate: use D binomial (pp0)
pall = 2.7822e-14
P10 = (D >= 10)*PD'
P10 = 0.0038

Solution to Exercise 15.14 (p. 478) 

n = 500;
p = 0.6;
p0 = 0.75;
D = 0:500;
PD = ibinom(500,p*p0,D);
k = [200 225 250];
P = zeros(1,3);
for i = 1:3
  P(i) = (D>=k(i))*PD';
end
disp(P)
    0.9893    0.5173    0.0140

Solution to Exercise 15.15 (p. 478) 

JD ~ Poisson (500*0.25 = 125); GD ~ Poisson (300*0.20 = 60); D ~ Poisson (185) 



k = 150:5:250;
PD = cpoisson(185,k);
disp([k;PD]')
  150.0000    0.9964
  155.0000    0.9892
  160.0000    0.9718
  165.0000    0.9362
  170.0000    0.8736
  175.0000    0.7785
  180.0000    0.6532
  185.0000    0.5098
  190.0000    0.3663
  195.0000    0.2405
  200.0000    0.1435
  205.0000    0.0776
  210.0000    0.0379
  215.0000    0.0167
  220.0000    0.0067
  225.0000    0.0024
  230.0000    0.0008
  235.0000    0.0002
  240.0000    0.0001
  245.0000    0.0000
  250.0000    0.0000



Solution to Exercise 15.16 (p. 478) 



m1a = 50*0.33; m2a = 45*0.47; ma = m1a + m2a;
PNa = cpoisson(ma,[30 35 40])
PNa = 0.9119 0.6890 0.3722



Solution to Exercise 15.17 (p. 479) 

Mac sales Poisson (30*0.4 + 65*0.2 = 25); HP sales Poisson (30*0.2 + 65*0.3 = 25.5); total Mac plus HP 
sales Poisson(50.5). 
Solution to Exercise 15.18 (p. 479)

X = 0:30;
Y = 0:80;
PX = ipoisson(80*0.1,X);
PY = ipoisson(80*0.3,Y);
icalc: X Y PX PY
PX10 = (X>=10)*PX'        % Approximate calculation
PX10 = 0.2834
pX10 = cpoisson(8,10)     % Direct calculation
pX10 = 0.2834
M = t>=0.5*u;
PM = total(M.*P)
PM = 0.1572

Solution to Exercise 15.19 (p. 479) 

X = 0:400; 
Y = 0:300; 

PX = ipoisson(700*0.25,X) ; 
PY = ipoisson(700*0.20,Y); 
icalc 

Enter row matrix of X-values X 
Enter row matrix of Y-values Y 
Enter X probabilities PX 
Enter Y probabilities PY 

Use array operations on matrices X, Y, PX, PY, t, u, and P 
P1 = (2.5*X<=475)*PX'
P1 = 0.8785
M = 2.5*t<=(3*u + 50) ; 
PM = total(M.*P) 
PM = 0.7500 

Solution to Exercise 15.20 (p. 479) 

P1 = cpoisson(130*0.2,30) = 0.2407
P2 = cpoisson(26,16) - cpoisson(26,41) = 0.9819
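The two probabilities follow from the fact that the number of Volvos is Poisson with mean 130 · 0.2 = 26. As a sketch that does not rely on cpoisson, they can be recomputed in Python. The helper `poisson_cdf` is an assumption introduced here.

```python
import math

def poisson_cdf(mu, k):
    """P(N <= k) for N ~ Poisson(mu); an empty sum gives 0 when k < 0."""
    return sum(math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1)) for j in range(k + 1))

mu = 130 * 0.2                                          # thinned Poisson mean: 26 Volvos per hour
p_at_least_30 = 1.0 - poisson_cdf(mu, 29)               # P(N >= 30)
p_16_to_40 = poisson_cdf(mu, 40) - poisson_cdf(mu, 15)  # P(16 <= N <= 40)
print(round(p_at_least_30, 4), round(p_16_to_40, 4))
```

The results should agree with P1 and P2 above.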

Solution to Exercise 15.21 (p. 479) 

T ~ Poisson (200*0.2 + 180*0.25 = 85), P ~ Poisson (200*0.8 + 180*0.75 = 295) 

a = 85 
b = 200*0.8 + 180*0.75 
b = 295 
YT = [12]; 
PYT = [0.7 0.3]; 
EYT = dot (YT, PYT) 
EYT = 1.3000 

VYT = dot (YT. -2, PYT) - EYT~2 
VYT = 0.2100 
YP = 1:5; 

PYP = 0.1* [3 3 2 1 1] ; 
EYP = dot (YP, PYP) 
EYP = 2.4000 

VYP = dot (YP. -2, PYP) - EYP-2 
VYP = 1.6400 
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EDT = 85*EYT 

EDT = 110.5000 

EDP = 295+EYP 

EDP = 708.0000 

ED = EDT + EDP 

ED = 818.5000 

VT = 85*(VYT + EYT~2) 

VT = 161.5000 

VP = 295* (VYP + EYP~2) 

VP = 2183 

VD = VT + VP 

VD = 2.2705e+03 



NT = 0:180;                    % Possible alternative
gNT = ipoisson(85,NT);
gYT = 0.1*[0 7 3];
[DT,PDT] = gendf(gNT,gYT);
EDT = dot(DT,PDT)
EDT = 110.5000
VDT = dot(DT.^2,PDT) - EDT^2
VDT = 161.5000
NP = 0:500;
gNP = ipoisson(295,NP);
gYP = 0.1*[0 3 3 2 1 1];
[DP,PDP] = gendf(gNP,gYP);     % Requires too much memory

$$g_{DT}(s) = \exp\left(85\left(0.7s + 0.3s^2 - 1\right)\right) \qquad g_{DP}(s) = \exp\left(295\left(0.1\left(3s + 3s^2 + 2s^3 + s^4 + s^5\right) - 1\right)\right) \tag{15.85}$$

$$g_D(s) = g_{DT}(s)\, g_{DP}(s) \tag{15.86}$$
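
The generating-function route for the smaller of the two compound demands can be sketched in Python. This is an illustration only, not the text's `gendf`: the function `compound_pmf` is a hypothetical helper that expands $g_N(g_Y(s))$ by repeated convolution, using the same truncation `NT = 0:180` and the same coefficient list `0.1*[0 7 3]` as above.

```python
import math

def poisson_pmf(mu, k):
    """P(N = k) for N ~ Poisson(mu), computed in log space."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def convolve(a, b):
    """Coefficient list of the product of two polynomials."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def compound_pmf(g_n, g_y):
    """Coefficients of g_N(g_Y(s)) = sum_n P(N = n) * g_Y(s)**n."""
    max_d = (len(g_n) - 1) * (len(g_y) - 1)
    pd = [0.0] * (max_d + 1)
    power = [1.0]                      # g_Y(s)**0
    for pn in g_n:
        for d, c in enumerate(power):
            pd[d] += pn * c
        power = convolve(power, g_y)   # next power of g_Y
    return pd

g_nt = [poisson_pmf(85.0, n) for n in range(181)]   # N truncated at 180
g_yt = [0.0, 0.7, 0.3]                               # P(Y=1)=0.7, P(Y=2)=0.3

pdt = compound_pmf(g_nt, g_yt)
edt = sum(d * p for d, p in enumerate(pdt))
vdt = sum(d * d * p for d, p in enumerate(pdt)) - edt ** 2
print(round(edt, 4), round(vdt, 4))
```

The mean and variance recovered from the expanded distribution match the moment formulas $E[D] = E[N]E[Y]$ and, for Poisson $N$, $\mathrm{Var}[D] = E[N]E[Y^2]$.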



Solution to Exercise 15.22 (p. 480) 



X = 0:120; 
PX = ipoisson(120*0.4,X) ; 
Y = 0:120; 

PY = ipoisson(120*0.35,Y); 
icalc 

Enter row matrix of X values X 
Enter row matrix of Y values Y 
Enter X probabilities PX 
Enter Y probabilities PY 

Use array operations on matrices X, Y, PX, PY, t, u, and P
M = t > u; 
PM = total(M.*P) 
PM = 0.7190 

Solution to Exercise 15.23 (p. 480) 

P = cpoisson(30*0.7+40*0.6,50) = 0.2468 
Solution to Exercise 15.24 (p. 480)

% First solution --- FY(t) = 1 - gN[P(Y>t)]
P = 1 - (0.4 + 0.6*0.75)^7
P = 0.6794

% Second solution --- Positive number of satisfactory bids,
% i.e. the outcome is indicator for event E, with P(E) = 0.25
pN = ibinom(7,0.6,0:7);
gY = [3/4 1/4];             % Generating function for indicator
[D,PD] = gendf(pN,gY);      % D is number of successes
Pa = (D>0)*PD'              % D>0 means at least one successful bid

Pa = 0.6794 

Solution to Exercise 15.25 (p. 480) 

Use $F_W(t) = g_N[P(Y \le t)]$

gN = 0.01*[0 5 7 10 11 12 13 12 11 10 9];
t = 100:100:500;
PY = 1 - exp(-0.01*t);
FW = polyval(fliplr(gN),PY)     % fliplr puts coefficients in
                                % descending order of powers
FW = 0.1330 0.4598 0.7490 0.8989 0.9615 
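
The evaluation $F_W(t) = g_N[P(Y \le t)]$ is just a polynomial evaluation with the probabilities $P(N = k)$ as coefficients. A minimal Python sketch of the same computation follows; `gen_fn` is a hypothetical helper playing the role of `polyval(fliplr(gN), ...)`, and it takes the coefficients in ascending order so no flip is needed.

```python
import math

def gen_fn(coeffs, s):
    """Evaluate g_N(s) = sum_k coeffs[k] * s**k by Horner's rule.

    coeffs[k] = P(N = k), in ascending order of powers.
    """
    result = 0.0
    for c in reversed(coeffs):
        result = result * s + c
    return result

g_n = [0.01 * c for c in [0, 5, 7, 10, 11, 12, 13, 12, 11, 10, 9]]
times = [100, 200, 300, 400, 500]

# P(Y <= t) for the exponential law used in the solution above.
fw = [gen_fn(g_n, 1.0 - math.exp(-0.01 * t)) for t in times]
print([round(v, 4) for v in fw])
```

Since $g_N(1) = 1$ and $g_N$ is increasing on $[0, 1]$, the values rise toward one as $t$ grows.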

Solution to Exercise 15.26 (p. 480) 

Probability of a successful bid PY = (125 - 100) /100 = 0.25 

PY =0.25; 
gN = 0.01* [5 10 15 20 20 10 10 7 3] ; 
P = 1 - polyval(fliplr(gN),PY) 
P = 0.9116 

Solution to Exercise 15.27 (p. 480) 

Consider a sequence of N trials with probability p = (180 - 150)/50 = 0.6.

gN = 0.01*[5 15 15 20 10 10 5 5 5 5 5];
gY = [0.4 0.6];
[D,PD] = gendf(gN,gY);
P = (D>0)*PD' 
P = 0.8493 

Solution to Exercise 15.28 (p. 481) 

gN = 0.01* [5 15 15 20 15 10 10 5 5] ; 
PY = 0.5 + 0.5*(1 - (4/5)^2)
PY = 0.6800

PW = 1 - polyval(fliplr(gN),PY)
PW = 0.6536
% alternate
gY = [0.68 0.32];
[D,PD] = gendf(gN,gY);
P = (D>0)*PD' 
P = 0.6536 

Solution to Exercise 15.29 (p. 481)

gN = ibinom(10,0.3,0:10);
t = 10:20;
p = 0.1*(20 - t);
P = polyval(fliplr(gN),p) - 0.7^10
P =
  Columns 1 through 7
    0.9718    0.7092    0.5104    0.3612    0.2503    0.1686    0.1092
  Columns 8 through 11
    0.0664    0.0360    0.0147         0

Pa = (0.7 + 0.3*p).^10 - 0.7^10     % Alternate form of gN
Pa =
  Columns 1 through 7
    0.9718    0.7092    0.5104    0.3612    0.2503    0.1686    0.1092
  Columns 8 through 11
    0.0664    0.0360    0.0147         0

Solution to Exercise 15.30 (p. 481) 

gN = 0.2*ones(1,5);
p = 1 - exp(-2);
FW = polyval(fliplr(gN),p)
FW = 0.7635

gY = [p 1-p];               % Alternate

[D,PD] = gendf (gN,gY); 
PW = (D==0)*PD' 
PW = 0.7635 

Solution to Exercise 15.31 (p. 481) 

p = 1 - exp(-0.0025*500);
FW = (0.95 + 0.05*p)^12
FW = 0.8410 
gN = ibinom(12, 0.05, 0:12) ; 
gY = [p 1-p] ; 
[D,PD] = gendf (gN,gY); 
PW = (D==0)*PD' 
PW = 0.8410 

Solution to Exercise 15.32 (p. 481) 

$$P(Y \le 5) = 0.005 \int_2^5 (37 - 2t)\, dt = 0.45 \tag{15.87}$$

p = 0.45;
P = 1 - (0.7 + 0.3*p)^10
P = 0.8352

gN = ibinom(10,0.3,0:10);
gY = [p 1-p];
[D,PD] = gendf(gN,gY);      % D is number of "successes"
Pa = (D>0)*PD' 
Pa = 0.8352

Solution to Exercise 15.33 (p. 481)

$N_t \sim$ Poisson $(\lambda t)$, $N_{Dt} \sim$ Poisson $(\lambda p t)$, $W_{Dt}$ exponential $(\lambda p)$.

p = 0.05;
t = 10;
lambda = 4;
EW = 1/(lambda*p)
EW = 5 

PND10 = cpoisson(lambda*p*t, [3 5]) 
PND10 = 0.3233 0.0527 

Solution to Exercise 15.34 (p. 481) 

$N_8$ is Poisson $(7 \cdot 8 = 56)$, $g_N(s) = e^{56(s-1)}$.

t = 2;
FW2 = exp(56*(1 - exp(-t^2) - 1))
FW2 = 0.3586 

Solution to Exercise 15.35 (p. 481) 

$F_W(k) = g_N[P(Y < k)]$, $P(Y < k) = 1 - q^{k-1}$, $N_t \sim$ Poisson $(12t)$

q = 1 - 0.35;
k = 5;
t = 2;
mu = 12;

FW = exp(mu*t*(1 - q^(k-1) - 1))
FW = 0.0138

Chapter 16

Conditional Independence, Given a 
Random Vector 

16.1 Conditional Independence, Given a Random Vector 1 

In the unit on Conditional Independence (Section 5.1) , the concept of conditional independence of events 
is examined and used to model a variety of common situations. In this unit, we investigate a more general 
concept of conditional independence, based on the theory of conditional expectation. This concept lies at the 
foundations of Bayesian statistics, of many topics in decision theory, and of the theory of Markov systems. 
We examine in this unit, very briefly, the first of these. In the unit on Markov Sequences (Section 16.2), we 
provide an introduction to the third. 

16.1.1 The concept 

The definition of conditional independence of events is based on a product rule which may be expressed in
terms of conditional expectation, given an event. The pair $\{A, B\}$ is conditionally independent, given $C$, iff

$$E[I_A I_B | C] = P(AB|C) = P(A|C)\,P(B|C) = E[I_A|C]\,E[I_B|C] \tag{16.1}$$

If we let $A = X^{-1}(M)$ and $B = Y^{-1}(N)$, then $I_A = I_M(X)$ and $I_B = I_N(Y)$. It would be reasonable to
consider the pair $\{X, Y\}$ conditionally independent, given event $C$, iff the product rule

$$E[I_M(X) I_N(Y) | C] = E[I_M(X)|C]\,E[I_N(Y)|C] \tag{16.2}$$

holds for all reasonable $M$ and $N$ (technically, all Borel $M$ and $N$). This suggests a possible extension to
conditional expectation, given a random vector. We examine the following concept.

Definition. The pair $\{X, Y\}$ is conditionally independent, given $Z$, designated $\{X, Y\}$ ci $|Z$, iff

$$E[I_M(X) I_N(Y) | Z] = E[I_M(X)|Z]\,E[I_N(Y)|Z] \quad \text{for all Borel } M, N \tag{16.3}$$

Remark. Since it is not necessary that $X$, $Y$, or $Z$ be real valued, we understand that the sets $M$ and $N$
are on the codomains for $X$ and $Y$, respectively. For example, if $X$ is a three dimensional random vector,
then $M$ is a subset of $\mathbf{R}^3$.

As in the case of other concepts, it is useful to identify some key properties, which we refer to by the 
numbers used in the table in Appendix G. We note two kinds of equivalences. For example, the following 
are equivalent. 

(CI1) $E[I_M(X) I_N(Y)|Z] = E[I_M(X)|Z]\,E[I_N(Y)|Z]$ a.s. for all Borel sets $M, N$



1 This content is available online at <http://cnx.org/content/m23813/1.5/>.

(CI5) $E[g(X,Z)\,h(Y,Z)|Z] = E[g(X,Z)|Z]\,E[h(Y,Z)|Z]$ a.s. for all Borel functions $g, h$

Because the indicator functions are special Borel functions, (CI1) is a special case of (CI5).
To show that (CI1) implies (CI5), we need to use linearity, monotonicity, and
monotone convergence in a manner similar to that used in extending properties (CE1) (p. 426) to (CE6)
(p. 427) for conditional expectation. A second kind of equivalence involves various patterns. The properties
(CI1), (CI2), (CI3), and (CI4) are equivalent, with (CI1) being
the defining condition for $\{X, Y\}$ ci $|Z$.

(CI1) $E[I_M(X) I_N(Y)|Z] = E[I_M(X)|Z]\,E[I_N(Y)|Z]$ a.s. for all Borel sets $M, N$
(CI2) $E[I_M(X)|Z, Y] = E[I_M(X)|Z]$ a.s. for all Borel sets $M$
(CI3) $E[I_M(X) I_Q(Z)|Z, Y] = E[I_M(X) I_Q(Z)|Z]$ a.s. for all Borel sets $M, Q$
(CI4) $E[I_M(X) I_Q(Z)|Y] = E\{E[I_M(X) I_Q(Z)|Z]\,|\,Y\}$ a.s. for all Borel sets $M, Q$

As an example of the kinds of argument needed to verify these equivalences, we show the equivalence of
(CI1) and (CI2).



• (CI1) implies (CI2). Set $e_1(Y,Z) = E[I_M(X)|Z,Y]$ and $e_2(Y,Z) = E[I_M(X)|Z]$.
If we show

$$E[I_N(Y) I_Q(Z)\, e_1(Y,Z)] = E[I_N(Y) I_Q(Z)\, e_2(Y,Z)] \quad \text{for all Borel } N, Q \tag{16.4}$$

then by the uniqueness property (E5b) (list, p. 600) for expectation we may assert $e_1(Y,Z) = e_2(Y,Z)$ a.s.
Using the defining property (CE1) (p. 426) for conditional expectation, we have

$$E\{I_N(Y) I_Q(Z)\, E[I_M(X)|Z,Y]\} = E[I_N(Y) I_Q(Z) I_M(X)] \tag{16.5}$$

On the other hand, use of (CE1) (p. 426), (CE8) (p. 428), (CI1), and (CE1) (p. 426) yields

$$E\{I_N(Y) I_Q(Z)\, E[I_M(X)|Z]\} = E\{I_Q(Z)\, E[I_N(Y)\, E[I_M(X)|Z]\,|\,Z]\} \tag{16.6}$$

$$= E\{I_Q(Z)\, E[I_M(X)|Z]\, E[I_N(Y)|Z]\} = E\{I_Q(Z)\, E[I_M(X) I_N(Y)|Z]\} \tag{16.7}$$

$$= E[I_N(Y) I_Q(Z) I_M(X)] \tag{16.8}$$

which establishes the desired equality.

• (CI2) implies (CI1). Using (CE9) (p. 428), (CE8) (p. 428), (CI2), and
(CE8) (p. 428), we have

$$E[I_M(X) I_N(Y)|Z] = E\{E[I_M(X) I_N(Y)|Z,Y]\,|\,Z\} \tag{16.9}$$

$$= E\{I_N(Y)\, E[I_M(X)|Z,Y]\,|\,Z\} = E\{I_N(Y)\, E[I_M(X)|Z]\,|\,Z\} \tag{16.10}$$

$$= E[I_M(X)|Z]\, E[I_N(Y)|Z] \tag{16.11}$$

Use of property (CE8) (p. 428) shows that (CI2) and (CI3) are equivalent. Now just as
(CI1) extends to (CI5), so also (CI3) is equivalent to

(CI6) $E[g(X,Z)|Z,Y] = E[g(X,Z)|Z]$ a.s. for all Borel functions $g$

Property (CI6) provides an important interpretation of conditional independence:
$E[g(X,Z)|Z]$ is the best mean-square estimator for $g(X,Z)$, given knowledge of $Z$. The condition
$\{X, Y\}$ ci $|Z$ implies that additional knowledge about $Y$ does not modify that best estimate. This
interpretation is often the most useful as a modeling assumption.
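
The interpretation of (CI2) and (CI6) can be checked numerically on a small discrete example. The following Python sketch uses a hypothetical joint distribution, built as $p(z)\,p(x|z)\,p(y|z)$ so that the pair is conditionally independent by construction; the numbers are illustrative assumptions, not from the text. It confirms that conditioning on $Y$ in addition to $Z$ does not change the conditional probabilities for $X$, even though $X$ and $Y$ are not independent unconditionally.

```python
from itertools import product

# Hypothetical ingredients: Z selects a "regime"; X and Y are then independent.
p_z = {0: 0.4, 1: 0.6}
p_x_given_z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
p_y_given_z = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.1, 1: 0.9}}

joint = {(x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
         for x, y, z in product((0, 1), repeat=3)}

def prob(pred):
    """Probability of the event described by pred(x, y, z)."""
    return sum(p for (x, y, z), p in joint.items() if pred(x, y, z))

# Largest gap between P(X=x | Z=z, Y=y) and P(X=x | Z=z) over all cases.
max_gap = 0.0
for x0, y0, z0 in product((0, 1), repeat=3):
    p_x_zy = (prob(lambda x, y, z: (x, y, z) == (x0, y0, z0))
              / prob(lambda x, y, z: (y, z) == (y0, z0)))
    p_x_z = (prob(lambda x, y, z: (x, z) == (x0, z0))
             / prob(lambda x, y, z: z == z0))
    max_gap = max(max_gap, abs(p_x_zy - p_x_z))

# Unconditionally, X and Y are dependent: P(X=1, Y=1) != P(X=1) P(Y=1).
p_xy = prob(lambda x, y, z: x == 1 and y == 1)
p_x = prob(lambda x, y, z: x == 1)
p_y = prob(lambda x, y, z: y == 1)
print(max_gap, p_xy, p_x * p_y)
```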
Similarly, property (CI4) is equivalent to

(CI8) $E[g(X,Z)|Y] = E\{E[g(X,Z)|Z]\,|\,Y\}$ a.s. for all Borel functions $g$

Property (CI7) (p. 602) is an alternate way of expressing (CI6). Property (CI9)
(p. 602) is just a convenient way of expressing the other conditions.

The additional properties in Appendix G (Section 17.7) are useful in a variety of contexts, particularly
in establishing properties of Markov systems. We refer to them as needed.

16.1.2 The Bayesian approach to statistics

In the classical approach to statistics, a fundamental problem is to obtain information about the population
distribution from the distribution in a simple random sample. There is an inherent difficulty with this
approach. Suppose it is desired to determine the population mean $\mu$. Now $\mu$ is an unknown quantity
about which there is uncertainty. However, since it is a constant, we cannot assign a probability such as
$P(a < \mu < b)$. This has no meaning.

The Bayesian approach makes a fundamental change of viewpoint. Since the population mean is a 
quantity about which there is uncertainty, it is modeled as a random variable whose value is to be determined 
by experiment. In this view, the population distribution is conceived as randomly selected from a class of such 
distributions. One way of expressing this idea is to refer to a state of nature. The population distribution 
has been "selected by nature" from a class of distributions. The mean value is thus a random variable whose 
value is determined by this selection. To implement this point of view, we assume 

1. The value of the parameter (say $\mu$ in the discussion above) is a "realization" of a parameter random
variable $H$. If two or more parameters are sought (say the mean and variance), they may be considered
components of a parameter random vector.

2. The population distribution is a conditional distribution, given the value of H. 

The Bayesian model 

If X is a random variable whose distribution is the population distribution and H is the parameter 
random variable, then {X, H} have a joint distribution. 

1. For each u in the range of H, we have a conditional distribution for X, given H = u. 

2. We assume a prior distribution for H. This is based on previous experience. 

3. We have a random sampling process, given $H$: i.e., $\{X_i : 1 \le i \le n\}$ is conditionally iid, given $H$. Let
$W = (X_1, X_2, \cdots, X_n)$ and consider the joint conditional distribution function

$$F_{W|H}(t_1, t_2, \cdots, t_n | u) = P(X_1 \le t_1, X_2 \le t_2, \cdots, X_n \le t_n | H = u) \tag{16.12}$$

$$= E\left[\prod_{i=1}^n I_{(-\infty, t_i]}(X_i) \,\Big|\, H = u\right] = \prod_{i=1}^n E\left[I_{(-\infty, t_i]}(X_i) \,|\, H = u\right] = \prod_{i=1}^n F_{X|H}(t_i | u) \tag{16.13}$$

If X has conditional density, given H, then a similar product rule holds. 

Population proportion 

We illustrate these ideas with one of the simplest, but most important, statistical problems: that of 
determining the proportion of a population which has a particular characteristic. Examples abound. We 
mention only a few to indicate the importance. 

1. The proportion of a population of voters who plan to vote for a certain candidate. 

2. The proportion of a given population which has a certain disease. 

3. The fraction of items from a production line which meet specifications. 

4. The fraction of women between the ages eighteen and fifty five who hold full time jobs. 

The parameter in this case is the proportion p who meet the criterion. If sampling is at random, then the 
sampling process is equivalent to a sequence of Bernoulli trials. If $H$ is the parameter random variable and
$S_n$ is the number of "successes" in a sample of size $n$, then the conditional distribution for $S_n$, given $H = u$,
is binomial $(n, u)$. To see this, consider

$$X_i = I_{E_i}, \quad \text{with } P(E_i | H = u) = E[X_i | H = u] = e(u) = u \tag{16.14}$$

Analysis is carried out for each fixed $u$ as in the ordinary Bernoulli case. If

$$S_n = \sum_{i=1}^n X_i = \sum_{i=1}^n I_{E_i} \quad \text{is the number of successes in } n \text{ component trials} \tag{16.15}$$

we have the result

$$E[I_{\{k\}}(S_n) | H = u] = P(S_n = k | H = u) = C(n,k)\, u^k (1-u)^{n-k} \quad \text{and} \quad E[S_n | H = u] = nu \tag{16.16}$$

The objective 

We seek to determine the best mean-square estimate of $H$, given $S_n = k$. Two steps must be taken:

1. If $H = u$, we know $E[S_n | H = u] = nu$. Sampling gives $S_n = k$. We make a Bayesian reversal to get
an expression for $E[H | S_n = k]$.

2. To complete the task, we must assume a prior distribution for $H$ on the basis of prior knowledge, if
any.

The Bayesian reversal

Since $\{S_n = k\}$ is an event with positive probability, we use the definition of the conditional expectation,
given an event, and the law of total probability (CE1b) (p. 426) to obtain

$$E[H|S_n = k] = \frac{E[H I_{\{k\}}(S_n)]}{E[I_{\{k\}}(S_n)]} = \frac{E\{H\, E[I_{\{k\}}(S_n)|H]\}}{E\{E[I_{\{k\}}(S_n)|H]\}} = \frac{\int u\, E[I_{\{k\}}(S_n)|H=u]\, f_H(u)\,du}{\int E[I_{\{k\}}(S_n)|H=u]\, f_H(u)\,du} \tag{16.17}$$

$$= \frac{C(n,k)\int u^{k+1}(1-u)^{n-k} f_H(u)\,du}{C(n,k)\int u^{k}(1-u)^{n-k} f_H(u)\,du} \tag{16.18}$$

A prior distribution for H 

The beta $(r, s)$ distribution (see Appendix G (Section 17.7)) proves to be a "natural" choice for this
purpose. Its range is the unit interval, and by proper choice of parameters $r, s$, the density function can
be given a variety of forms (see Figure 16.1 and Figure 16.2).

Figure 16.1: The Beta(r,s) density for r = 2, s = 1, 2, 10.

Figure 16.2: The Beta(r,s) density for r = 5, s = 2, 5, 10.



Its analysis is based on the integrals

$$\int_0^1 u^{r-1}(1-u)^{s-1}\,du = \frac{\Gamma(r)\Gamma(s)}{\Gamma(r+s)} \quad \text{with } \Gamma(a+1) = a\Gamma(a) \tag{16.19}$$

For $H \sim$ beta $(r, s)$, the density is given by

$$f_H(t) = \frac{\Gamma(r+s)}{\Gamma(r)\Gamma(s)}\, t^{r-1}(1-t)^{s-1} = A(r,s)\, t^{r-1}(1-t)^{s-1} \quad 0 < t < 1 \tag{16.20}$$

For $r \ge 2$, $s \ge 2$, $f_H$ has a maximum at $(r-1)/(r+s-2)$. For $r, s$ positive integers, $f_H$ is a polynomial
on $[0, 1]$, so that determination of the distribution function is easy. In any case, straightforward integration,
using the integral formula above, shows

$$E[H] = \frac{r}{r+s} \quad \text{and} \quad \mathrm{Var}[H] = \frac{rs}{(r+s)^2(r+s+1)} \tag{16.21}$$

If the prior distribution for $H$ is beta $(r, s)$, we may complete the determination of $E[H|S_n = k]$ as follows.

$$E[H|S_n = k] = \frac{A(r,s)\int_0^1 u^{k+1}(1-u)^{n-k}\, u^{r-1}(1-u)^{s-1}\,du}{A(r,s)\int_0^1 u^{k}(1-u)^{n-k}\, u^{r-1}(1-u)^{s-1}\,du} = \frac{\int_0^1 u^{k+r}(1-u)^{n+s-k-1}\,du}{\int_0^1 u^{k+r-1}(1-u)^{n+s-k-1}\,du} \tag{16.22}$$

$$= \frac{\Gamma(r+k+1)\,\Gamma(n+s-k)}{\Gamma(r+s+n+1)} \cdot \frac{\Gamma(r+s+n)}{\Gamma(r+k)\,\Gamma(n+s-k)} = \frac{k+r}{n+r+s} \tag{16.23}$$

We may adapt the analysis above to show that $H$ is conditionally beta $(r+k, s+n-k)$, given $S_n = k$.

$$F_{H|S}(t|k) = \frac{E[I_t(H)\, I_{\{k\}}(S_n)]}{E[I_{\{k\}}(S_n)]} \quad \text{where } I_t(H) = I_{[0,t]}(H) \tag{16.24}$$

The analysis goes through exactly as for $E[H|S_n = k]$, except that $H$ is replaced by $I_t(H)$. In the integral
expression for the numerator, one factor $u$ is replaced by $I_t(u)$. For $H \sim$ beta $(r, s)$, we get

$$F_{H|S}(t|k) = \frac{\Gamma(r+s+n)}{\Gamma(r+k)\,\Gamma(n+s-k)} \int_0^t u^{k+r-1}(1-u)^{n+s-k-1}\,du = \int_0^t f_{H|S}(u|k)\,du \tag{16.25}$$

The integrand is the density for beta $(r+k, n+s-k)$.

Any prior information on the distribution for $H$ can be utilized to select suitable $r, s$. If there is no prior
information, we simply take $r = 1$, $s = 1$, which corresponds to $H \sim$ uniform on $(0, 1)$. The value is as
likely to be in any subinterval of a given length as in any other of the same length. The information in the
sample serves to modify the distribution for $H$, conditional upon that information.

Example 16.1: Population proportion with a beta prior 

It is desired to estimate the portion of the student body which favors a proposed increase in the 
student blanket tax to fund the campus radio station. A sample of size n = 20 is taken. Fourteen 
respond in favor of the increase. Assuming prior ignorance (i.e., that $H \sim$ beta $(1,1)$), what is
the conditional distribution given $S_{20} = 14$? After the first sample is taken, a second sample of
size n = 20 is taken, with thirteen favorable responses. Analysis is made using the conditional 
distribution for the first sample as the prior for the second. Make a new estimate of H. 



Figure 16.3: Conditional densities for repeated sampling, Example 16.1 (Population proportion with a
beta prior).

SOLUTION

For the first sample the parameters are $r = s = 1$. According to the treatment above, $H$ is
conditionally beta $(k + r, n + s - k) = (15, 7)$. The density has a maximum at
$(r + k - 1)/(r + k + n + s - k - 2) = k/n$. The conditional expectation, however, is
$(r + k)/(r + s + n) = 15/22 \approx 0.6818$.

For the second sample, with the conditional distribution as the new prior, we should expect more
sharpening of the density about the new mean-square estimate. For the new sample, $n = 20$, $k = 13$,
and the prior $H \sim$ beta $(15, 7)$. The new conditional distribution has parameters $r^* = 15 + 13 = 28$
and $s^* = 20 + 7 - 13 = 14$. The density has a maximum at $t = (28 - 1)/(28 + 14 - 2) = 27/40 = 0.6750$.
The best estimate of $H$ is $28/(28 + 14) = 2/3$. The conditional densities in the two cases
may be plotted with MATLAB (see Figure 16.3).

t = 0:0.01:1;
plot(t,beta(15,7,t),'k-',t,beta(28,14,t),'k--')

As expected, the maximum for the second is somewhat larger and occurs at a slightly smaller t, 
reflecting the smaller k. And the density in the second case shows less spread, resulting from the 
fact that prior information from the first sample is incorporated into the analysis of the second 
sample. 

The same result is obtained if the two samples are combined into one sample of size 40. 
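
The parameter bookkeeping in this example is simple enough to check with exact rational arithmetic. The following Python sketch is an illustration only; the helper names `update`, `mean`, and `mode` are hypothetical and merely encode the rules derived above: a beta $(r, s)$ prior and $k$ successes in $n$ trials give a beta $(r + k, s + n - k)$ conditional distribution.

```python
from fractions import Fraction

def update(prior, n, k):
    """Beta (r, s) prior with k successes in n trials -> beta (r + k, s + n - k)."""
    r, s = prior
    return (r + k, s + n - k)

def mean(params):
    """Conditional expectation r / (r + s), the best mean-square estimate."""
    r, s = params
    return Fraction(r, r + s)

def mode(params):
    """Location of the density maximum, (r - 1) / (r + s - 2)."""
    r, s = params
    return Fraction(r - 1, r + s - 2)

prior = (1, 1)                      # prior ignorance: uniform on (0, 1)
first = update(prior, 20, 14)       # after the first sample
second = update(first, 20, 13)      # first result used as the new prior
combined = update(prior, 40, 27)    # both samples pooled

print(first, mean(first), mode(first))
print(second, mean(second), mode(second))
print(combined == second)
```

Because the update only adds successes to $r$ and failures to $s$, sequential and pooled analyses necessarily agree.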

It may be well to compare the result of Bayesian analysis with that for classical statistics. Since, in the
latter case, prior information is not utilized, we make the comparison with the case of no prior knowledge
($r = s = 1$). For the classical case, the estimator for $\mu$ is the sample average; for the Bayesian case with
beta prior, the estimate is the conditional expectation of $H$, given $S_n$.

$$\text{If } S_n = k: \quad \text{Classical estimate} = k/n \qquad \text{Bayesian estimate} = (k+1)/(n+2) \tag{16.26}$$

For large sample size n, these do not differ significantly. For small samples, the difference may be quite 
important. The Bayesian estimate is often referred to as the small sample estimate, although there is nothing 
in the Bayesian procedure which calls for small samples. In any event, the Bayesian estimate seems preferable 
for small samples, and it has the advantage that prior information may be utilized. The sampling procedure 
upgrades the prior distribution. 
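
To see how quickly the two estimates converge, one can tabulate them for a few sample sizes. This is a small Python sketch with hypothetical sample counts chosen only for illustration; the function names are not from the text.

```python
from fractions import Fraction

def classical(n, k):
    """Sample proportion k / n."""
    return Fraction(k, n)

def bayes_uniform_prior(n, k):
    """Conditional expectation (k + 1) / (n + 2) under a beta (1, 1) prior."""
    return Fraction(k + 1, n + 2)

# Hypothetical samples: (n, k) pairs with increasing n.
samples = [(3, 0), (10, 7), (100, 70), (1000, 700)]
gaps = {}
for n, k in samples:
    c, b = classical(n, k), bayes_uniform_prior(n, k)
    gaps[n] = abs(c - b)
    print(n, k, float(c), round(float(b), 4), round(float(gaps[n]), 4))
```

With no successes in three trials the classical estimate is zero, while the Bayesian estimate is 1/5; for the larger samples the gap shrinks steadily.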

The essential idea of the Bayesian approach is the view that an unknown parameter about which there 
is uncertainty is modeled as the value of a random variable. The name Bayesian comes from the role of 
Bayesian reversal in the analysis. 

The application of Bayesian analysis to the population proportion required Bayesian reversal in the case
of discrete $S_n$. We consider, next, this reversal process when all random variables are absolutely continuous.

The Bayesian reversal for a joint absolutely continuous pair

In the treatment above, we utilize the fact that the conditioning random variable $S_n$ is discrete. Suppose
the pair $\{W, H\}$ is jointly absolutely continuous, and $f_{W|H}(t|u)$ and $f_H(u)$ are specified. To determine

$$E[H|W = t] = \int u\, f_{H|W}(u|t)\,du \tag{16.27}$$

we need $f_{H|W}(u|t)$. This requires a Bayesian reversal of the conditional densities. Now by definition

$$f_{H|W}(u|t) = \frac{f_{WH}(t,u)}{f_W(t)} \quad \text{and} \quad f_{WH}(t,u) = f_{W|H}(t|u)\, f_H(u) \tag{16.28}$$

Since by the rule for determining the marginal density

$$f_W(t) = \int f_{WH}(t,u)\,du = \int f_{W|H}(t|u)\, f_H(u)\,du \tag{16.29}$$

we have

$$f_{H|W}(u|t) = \frac{f_{W|H}(t|u)\, f_H(u)}{\int f_{W|H}(t|u)\, f_H(u)\,du} \quad \text{and} \quad E[H|W = t] = \frac{\int u\, f_{W|H}(t|u)\, f_H(u)\,du}{\int f_{W|H}(t|u)\, f_H(u)\,du} \tag{16.30}$$

Example 16.2: A Bayesian reversal

Suppose $H \sim$ exponential $(\lambda)$ and the $X_i$ are conditionally iid, exponential $(u)$, given $H = u$. A
sample of size $n$ is taken. Put $W = (X_1, X_2, \cdots, X_n)$, $t = (t_1, t_2, \cdots, t_n)$, and
$t^* = t_1 + t_2 + \cdots + t_n$. Determine the best mean-square estimate of $H$, given $W = t$.

SOLUTION

$$f_{X_i|H}(t_i|u) = u e^{-u t_i} \quad \text{so that} \quad f_{W|H}(t|u) = \prod_{i=1}^n u e^{-u t_i} = u^n e^{-u t^*} \tag{16.31}$$

Hence

$$E[H|W = t] = \int u\, f_{H|W}(u|t)\,du = \frac{\int_0^\infty u^{n+1} e^{-u t^*} \lambda e^{-\lambda u}\,du}{\int_0^\infty u^{n} e^{-u t^*} \lambda e^{-\lambda u}\,du} \tag{16.32}$$

$$= \frac{\int_0^\infty u^{n+1} e^{-(\lambda + t^*)u}\,du}{\int_0^\infty u^{n} e^{-(\lambda + t^*)u}\,du} = \frac{(n+1)!}{(\lambda + t^*)^{n+2}} \cdot \frac{(\lambda + t^*)^{n+1}}{n!} = \frac{n+1}{\lambda + t^*} \quad \text{where } t^* = \sum_{i=1}^n t_i \tag{16.33}$$
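
The closed form $(n+1)/(\lambda + t^*)$ can be checked against direct numerical integration of the two integrals in the Bayesian reversal. The following Python sketch does this with a composite Simpson rule; the values of $\lambda$, $n$, and $t^*$ and the integration cutoff are hypothetical choices made only for the check.

```python
import math

def simpson(f, a, b, m=20000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

# Hypothetical values, for illustration only.
lam, n, t_star = 2.0, 3, 1.5
upper = 20.0     # exp(-(lam + t_star) * 20) is negligible

def likelihood_times_prior(u):
    """f_{W|H}(t|u) f_H(u) = u**n exp(-u t*) * lam exp(-lam u)."""
    return u ** n * math.exp(-u * t_star) * lam * math.exp(-lam * u)

numerator = simpson(lambda u: u * likelihood_times_prior(u), 0.0, upper)
denominator = simpson(likelihood_times_prior, 0.0, upper)

estimate = numerator / denominator
closed_form = (n + 1) / (lam + t_star)
print(estimate, closed_form)
```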

16.2 Elements of Markov Sequences 2
16.2.1 Elements of Markov Sequences

Markov sequences (Markov chains) are often studied at a very elementary level, utilizing algebraic tools 
such as matrix analysis. In this section, we show that the fundamental Markov property is an expression of 
conditional independence of "past" and "future," given the "present." The essential Chapman-Kolmogorov 
equation is seen as a consequence of this conditional independence. In the usual time-homogeneous case 
with finite state space, the Chapman-Kolmogorov equation leads to the algebraic formulation that is widely 
studied at a variety of levels of mathematical sophistication. With the background laid, we only sketch some 
of the more common results. This should provide a probabilistic perspective for a more complete study of 
the algebraic analysis. 

Markov sequences 

We wish to model a system characterized by a sequence of states taken on at discrete instants which 
we call transition times. At each transition time, there is either a change to a new state or a renewal 
of the state immediately before the transition. Each state is maintained unchanged during the period 
or stage between transitions. At any transition time, the move to the next state is characterized by a 
conditional transition probability distribution. We suppose that the system is memoryless, in the sense that 
the transition probabilities are dependent upon the current state (and perhaps the period number), but not 
upon the manner in which that state was reached. The past influences the future only through the present. 
This is the Markov property, which we model in terms of conditional independence. 

For period $i$, the state is represented by a value of a random variable $X_i$, whose value is one of the
members of a set $E$, known as the state space. We consider only a finite state space and identify the states
by integers from 1 to $M$. We thus have a sequence

$$X_{\mathbf{N}} = \{X_n : n \in \mathbf{N}\}, \quad \text{where } \mathbf{N} = \{0, 1, 2, \cdots\} \tag{16.34}$$

We view an observation of the system as a composite trial. Each $\omega$ yields a sequence of states
$\{X_0(\omega), X_1(\omega), \cdots\}$ which is referred to as a realization of the sequence, or a trajectory. We suppose
the system is evolving in time. At discrete instants of time $t_1, t_2, \cdots$ the system makes a transition from one
state to the succeeding one (which may be the same).



2 This content is available online at <http://cnx.org/content/m23824/1.5/>.

Initial period:   n = 0,   $t \in [0, t_1)$,         state is $X_0(\omega)$; at $t_1$ the transition is to $X_1(\omega)$
Period one:       n = 1,   $t \in [t_1, t_2)$,       state is $X_1(\omega)$; at $t_2$ the transition is to $X_2(\omega)$
......
Period k:         n = k,   $t \in [t_k, t_{k+1})$,   state is $X_k(\omega)$; at $t_{k+1}$ move to $X_{k+1}(\omega)$
......

Table 16.1



The parameter $n$ indicates the period $t \in [t_n, t_{n+1})$. If the periods are of unit length, then $t_n = n$. At
$t_{n+1}$, there is a transition from the state $X_n(\omega)$ to the state $X_{n+1}(\omega)$ for the next period. To simplify
writing, we adopt the following convention:

$$U_n = (X_0, X_1, \cdots, X_n) \in E_n, \quad U_{m,n} = (X_m, \cdots, X_n), \quad \text{and} \quad U^n = (X_n, X_{n+1}, \cdots) \in E^n \tag{16.35}$$

The random vector $U_n$ is called the past at $n$ of the sequence $X_{\mathbf{N}}$ and $U^n$ is the future at $n$. In order to
capture the notion that the system is without memory, so that the future is affected by the present, but not
by how the present is reached, we utilize the notion of conditional independence, given a random vector, in
the following

Definition. The sequence $X_{\mathbf{N}}$ is Markov iff

$$\text{(M)} \quad \{X_{n+1}, U_n\} \text{ ci } |X_n \quad \text{for all } n \ge 0 \tag{16.36}$$

Several conditions equivalent to the Markov condition (M) may be obtained with the aid of properties of
conditional independence. We note first that (M) is equivalent to

$$P(X_{n+1} = k | X_n = j, U_{n-1} \in Q) = P(X_{n+1} = k | X_n = j) \quad \text{for each } n \ge 0,\ j, k \in E, \text{ and } Q \subset E^{n-1} \tag{16.37}$$

The state in the next period is conditioned by the past only through the present state, and not by the
manner in which the present state is reached. The statistics of the process are determined by the initial
state probabilities and the transition probabilities

$$P(X_{n+1} = k | X_n = j) \quad \forall\, j, k \in E,\ n \ge 0 \tag{16.38}$$

The following examples exhibit a pattern which implies the Markov condition and which can be exploited 
to obtain the transition probabilities. 

Example 16.3: One-dimensional random walk 

An object starts at a given initial position. At discrete instants $t_1, t_2, \cdots$ the object moves a random
distance along a line. The various moves are independent of each other. Let

$Y_0$ be the initial position

$Y_k$ be the amount the object moves at time $t = t_k$, with $\{Y_k : 1 \le k\}$ iid

$X_n = \sum_{k=0}^n Y_k$ be the position after $n$ moves.

We note that $X_{n+1} = g(X_n, Y_{n+1})$. Since the position after the transition at $t_{n+1}$ is affected by
the past only by the value of the position $X_n$ and not by the sequence of positions which led to this
position, it is reasonable to suppose that the process $X_{\mathbf{N}}$ is Markov. We verify this below.

Example 16.4: A class of branching processes 

Each member of a population is able to reproduce. For simplicity, we suppose that at certain 
discrete instants the entire next generation is produced. Some mechanism limits each generation 
to a maximum population of $M$ members. Let

$Z_{in}$ be the number propagated by the $i$th member of the $n$th generation.

$Z_{in} = 0$ indicates death and no offspring, $Z_{in} = k$ indicates a net of $k$ propagated by the $i$th
member (either death and $k$ offspring or survival and $k - 1$ offspring).

The population in generation $n + 1$ is given by

$$X_{n+1} = \min\left\{M, \sum_{i=1}^{X_n} Z_{in}\right\} \tag{16.39}$$

We suppose the class $\{Z_{in} : 1 \le i \le M,\ 0 \le n\}$ is iid. Let $Y_{n+1} = (Z_{1n}, Z_{2n}, \cdots, Z_{Mn})$. Then
$\{Y_{n+1}, U_n\}$ is independent. It seems reasonable to suppose the sequence $X_{\mathbf{N}}$ is Markov.

Example 16.5: An inventory problem 

A certain item is stocked according to an $(m, M)$ inventory policy, as follows:

• If stock at the end of a period is less than $m$, order up to $M$.

• If stock at the end of a period is $m$ or greater, do not order.

Let $X_0$ be the initial stock, and $X_n$ be the stock at the end of the $n$th period (before restocking),
and let $D_n$ be the demand during the $n$th period. Then for $n \ge 0$,

$$X_{n+1} = \begin{cases} \max\{M - D_{n+1}, 0\} & \text{if } 0 \le X_n < m \\ \max\{X_n - D_{n+1}, 0\} & \text{if } m \le X_n \end{cases} \; = g(X_n, D_{n+1}) \tag{16.40}$$

If we suppose $\{D_n : 1 \le n\}$ is independent, then $\{D_{n+1}, U_n\}$ is independent for each $n \ge 0$, and
the Markov condition seems to be indicated.

Remark. In this case, the actual transition takes place throughout the period. However, for purposes of 
analysis, we examine the state only at the end of the period (before restocking). Thus, the transitions are 
dispersed in time, but the observations are at discrete instants. 

Example 16.6: Remaining lifetime 

A piece of equipment has a lifetime which is an integral number of units of time. When a unit 
fails, it is replaced immediately with another unit of the same type. Suppose 

• $X_n$ is the remaining lifetime of the unit in service at time $n$

• $Y_{n+1}$ is the lifetime of the unit installed at time $n$, with $\{Y_n : 1 \le n\}$ iid

$$\text{Then} \quad X_{n+1} = \begin{cases} X_n - 1 & \text{if } X_n \ge 1 \\ Y_{n+1} - 1 & \text{if } X_n = 0 \end{cases} \; = g(X_n, Y_{n+1}) \tag{16.41}$$

Remark. Each of these four examples exhibits the pattern

i. $\{X_0, Y_n : 1 \le n\}$ is independent
ii. $X_{n+1} = g_{n+1}(X_n, Y_{n+1})$, $n \ge 0$

We now verify the Markov condition and obtain a method for determining the transition probabilities.

A pattern yielding Markov sequences

Suppose $\{Y_n : 0 \le n\}$ is independent (call these the driving random variables). Set

$$X_0 = g_0(Y_0) \quad \text{and} \quad X_{n+1} = g_{n+1}(X_n, Y_{n+1}) \quad \forall\, n \ge 0 \tag{16.42}$$
Then

a. $X_{\mathbf{N}}$ is Markov

b. $P(X_{n+1} \in Q | X_n = u) = P[g_{n+1}(u, Y_{n+1}) \in Q]$ for all $n$, $u$, and any Borel set $Q$.

VERIFICATION

a. It is apparent that if $Y_0, Y_1, \cdots, Y_n$ are known, then $U_n$ is known. Thus $U_n = h_n(Y_0, Y_1, \cdots, Y_n)$,
which ensures each pair $\{Y_{n+1}, U_n\}$ is independent. By property (CI13) (p. 603), with
$X = Y_{n+1}$, $Y = X_n$, and $Z = U_{n-1}$, we have

$$\{Y_{n+1}, U_{n-1}\} \text{ ci } |X_n \tag{16.43}$$

Since $X_{n+1} = g_{n+1}(X_n, Y_{n+1})$ and $U_n = h_n(X_n, U_{n-1})$, property (CI9) (p. 602) ensures

$$\{X_{n+1}, U_n\} \text{ ci } |X_n \quad \forall\, n \ge 0 \tag{16.44}$$

which is the Markov property.

b. $P(X_{n+1} \in Q | X_n = u) = E\{I_Q[g_{n+1}(X_n, Y_{n+1})] \,|\, X_n = u\}$ a.s. $= E\{I_Q[g_{n+1}(u, Y_{n+1})]\}$ a.s. $[P_X]$ by
(CE10b) (p. 601) $= P[g_{n+1}(u, Y_{n+1}) \in Q]$ by (E1a) (p. 599)

□
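
Part (b) gives a direct recipe for transition probabilities when the driving variables are discrete: for each state $u$, push the distribution of $Y$ through $g(u, \cdot)$. The following Python sketch applies this to the inventory example. It is an illustration under stated assumptions: the policy values $m = 1$, $M = 3$ and the demand distribution are hypothetical, and `transition_matrix` is not one of the text's m-programs.

```python
def transition_matrix(g, states, y_dist):
    """P[i][j] = P(g(states[i], Y) = states[j]); y_dist maps y -> P(Y = y)."""
    index = {s: n for n, s in enumerate(states)}
    P = [[0.0] * len(states) for _ in states]
    for u in states:
        for y, py in y_dist.items():
            P[index[u]][index[g(u, y)]] += py
    return P

# Hypothetical (m, M) policy and demand distribution, for illustration only.
m, M = 1, 3
demand = {0: 0.2, 1: 0.4, 2: 0.3, 3: 0.1}

def g(x, d):
    """Stock at the end of the next period under the (m, M) policy."""
    level = M if x < m else x      # order up to M only when stock is below m
    return max(level - d, 0)

P = transition_matrix(g, list(range(M + 1)), demand)
for row in P:
    print([round(p, 2) for p in row])
```

Each row is the conditional distribution of the next state, so each row sums to one. Rows 0 and 3 coincide here because a stock of 0 is replenished to 3 before demand occurs.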

The application of this proposition, below, to the previous examples shows that the transition probabilities 
are invariant with n. This case is important enough to warrant separate classification. 

Definition. If $P(X_{n+1} \in Q | X_n = u)$ is invariant with $n$, for all Borel sets $Q$, all $u \in E$, the Markov
process $X_{\mathbf{N}}$ is said to be homogeneous.

As a matter of fact, this is the only case usually treated in elementary texts. In this regard, we note the
following special case of the proposition above.

Homogeneous Markov sequences

If $\{Y_n : 1 \le n\}$ is iid and $g_{n+1} = g$ for all $n$, then the process is a homogeneous Markov process, and

$$P(X_{n+1} \in Q | X_n = u) = P[g(u, Y_{n+1}) \in Q], \quad \text{invariant with } n \tag{16.45}$$

□

Remark.

In the homogeneous case, the transition probabilities are invariant with $n$. In this case, we write

$$P(X_{n+1} = j | X_n = i) = p(i,j) \text{ or } p_{ij} \quad (\text{invariant with } n) \tag{16.46}$$

These are called the (one-step) transition probabilities.

The transition probabilities may be arranged in a matrix P called the transition probability matrix,
usually referred to as the transition matrix,

$$P = [p(i,j)] \tag{16.47}$$

The element $p(i,j)$ on row $i$ and column $j$ is the probability $P(X_{n+1} = j | X_n = i)$. Thus, the elements on
the $i$th row constitute the conditional distribution for $X_{n+1}$, given $X_n = i$. The transition matrix thus has
the property that each row sums to one. Such a matrix is called a stochastic matrix. We return to the
examples. From the propositions on transition probabilities, it is apparent that each is Markov. Since the
function $g$ is the same for all $n$ and the driving random variables corresponding to the $Y_i$ form an iid class,
the sequences must be homogeneous. We may utilize part (b) of the propositions to obtain the one-step
transition probabilities.

Example 16.7: Random walk continued 

$g_n(u, Y_{n+1}) = u + Y_{n+1}$, so that $g_n$ is invariant with $n$. Since $\{Y_n : 1 \le n\}$ is iid,

$$P(X_{n+1} = k | X_n = j) = P(j + Y = k) = P(Y = k - j) = p_{k-j} \quad \text{where } p_k = P(Y = k) \tag{16.48}$$

Example 16.8: Branching process continued

$g(j, Y_{n+1}) = \min\{M, \sum_{i=1}^{j} Z_{in}\}$ and $E = \{0, 1, \cdots, M\}$. If $\{Z_{in} : 1 \le i \le M, 1 \le n\}$ is iid,
then

$$W_{jn} = \sum_{i=1}^{j} Z_{in} \quad \text{ensures } \{W_{jn} : 1 \le n\} \text{ is iid for each } j \in E \tag{16.49}$$

We thus have

$$P(X_{n+1} = k | X_n = j) = \begin{cases} P(W_{jn} = k) & \text{for } 0 \le k < M \\ P(W_{jn} \ge M) & \text{for } k = M \end{cases} \qquad 0 \le j \le M \tag{16.50}$$

With the aid of moment generating functions, one may determine distributions for

$$W_1 = Z_1, \quad W_2 = Z_1 + Z_2, \quad \cdots, \quad W_M = Z_1 + \cdots + Z_M \tag{16.51}$$

These calculations are implemented in an m-procedure called branchp. We simply need the distribution
for the iid $Z_{in}$.

% file branchp.m
% Calculates transition matrix for a simple branching
% process with specified maximum population.
disp('Do not forget zero probabilities for missing values of Z')
PZ = input('Enter PROBABILITIES for individuals  ');
M  = input('Enter maximum allowable population  ');
mz = length(PZ) - 1;
EZ = dot(0:mz,PZ);
disp(['The average individual propagation is ',num2str(EZ),])
P  = zeros(M+1,M+1);
Z  = zeros(M,M*mz+1);
k  = 0:M*mz;
a  = min(M,k);
z  = 1;
P(1,1) = 1;
for i = 1:M                 % Operation similar to genD
  z = conv(PZ,z);
  Z(i,1:i*mz+1) = z;
  [t,p] = csort(a,Z(i,:));
  P(i+1,:) = p;
end
disp('The transition matrix is P')
disp('To study the evolution of the process, call for branchdbn')

PZ = 0.01*[15 45 25 10 5];    % Probability distribution for individuals
branchp                       % Call for procedure
Do not forget zero probabilities for missing values of Z
Enter PROBABILITIES for individuals  PZ
Enter maximum allowable population  10
The average individual propagation is 1.45
The transition matrix is P
To study the evolution of the process, call for branchdbn
disp(P)                       % Optional display of generated P
  Columns 1 through 7
    1.0000         0         0         0         0         0         0
    0.1500    0.4500    0.2500    0.1000    0.0500         0         0
    0.0225    0.1350    0.2775    0.2550    0.1675    0.0950    0.0350
    0.0034    0.0304    0.1080    0.1991    0.2239    0.1879    0.1293
    0.0005    0.0061    0.0307    0.0864    0.1534    0.1910    0.1852
    0.0001    0.0011    0.0075    0.0284    0.0702    0.1227    0.1623
    0.0000    0.0002    0.0017    0.0079    0.0253    0.0579    0.1003
    0.0000    0.0000    0.0003    0.0020    0.0078    0.0222    0.0483
    0.0000    0.0000    0.0001    0.0005    0.0021    0.0074    0.0194
    0.0000    0.0000    0.0000    0.0001    0.0005    0.0022    0.0068
    0.0000    0.0000    0.0000    0.0000    0.0001    0.0006    0.0022
  Columns 8 through 11
         0         0         0         0
         0         0         0         0
    0.0100    0.0025         0         0
    0.0705    0.0315    0.0119    0.0043
    0.1481    0.0987    0.0559    0.0440
    0.1730    0.1545    0.1179    0.1625
    0.1381    0.1574    0.1528    0.3585
    0.0832    0.1179    0.1412    0.5771
    0.0406    0.0698    0.1010    0.7591
    0.0169    0.0345    0.0590    0.8799
    0.0062    0.0147    0.0294    0.9468


Note that p(0, 0) = 1. If the population ever reaches zero, it is extinct and no more births can 
occur. Also, if the maximum population (10 in this case) is reached, there is a high probability of 
returning to that value and very small probability of becoming extinct (reaching zero state) . 
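
The procedure branchp relies on csort, which belongs to the author's collection of m-functions. As a hedged sketch for readers who do not have that collection at hand, the same transition matrix can be assembled with conv and a plain loop. The names w, nw, col, and rowsums are illustrative assumptions, not part of branchp; the data PZ and M are those of the example.

```matlab
% Sketch: branching-process transition matrix without csort.
PZ = 0.01*[15 45 25 10 5];    % P(Z = 0), ..., P(Z = 4), as in the example
M  = 10;                      % maximum allowable population
P  = zeros(M+1, M+1);
P(1,1) = 1;                   % state 0 (extinction) is absorbing
w  = 1;                       % distribution of an empty sum
for j = 1:M
    w  = conv(PZ, w);         % distribution of W_j = Z_1 + ... + Z_j
    nw = length(w);           % W_j takes the values 0:(nw-1)
    for k = 0:nw-1
        col = min(M, k) + 1;  % lump all values >= M into state M
        P(j+1, col) = P(j+1, col) + w(k+1);
    end
end
rowsums = sum(P, 2);          % each entry should be 1 up to rounding
disp(P(2:3, 1:5))
```

Rows 2 and 3 of the result may be compared with the corresponding rows of the matrix displayed by branchp.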

Example 16.9: Inventory problem (continued) 

In this case,

$$g(j, D_{n+1}) = \begin{cases} \max\{M - D_{n+1}, 0\} & \text{for } 0 \le j < m \\ \max\{j - D_{n+1}, 0\} & \text{for } m \le j \le M \end{cases} \tag{16.52}$$

Numerical example

$$m = 1 \qquad M = 3 \qquad D_n \ \text{is Poisson (1)} \tag{16.53}$$

To simplify writing, use $D$ for $D_n$. Because of the invariance with $n$, set

$$P(X_{n+1} = k | X_n = j) = p(j, k) = P[g(j, D_{n+1}) = k] \tag{16.54}$$

The various cases yield

$g(0, D) = \max\{3 - D, 0\}$

  $g(0, D) = 0$ iff $D \ge 3$ implies $p(0, 0) = P(D \ge 3)$
  $g(0, D) = 1$ iff $D = 2$ implies $p(0, 1) = P(D = 2)$
  $g(0, D) = 2$ iff $D = 1$ implies $p(0, 2) = P(D = 1)$
  $g(0, D) = 3$ iff $D = 0$ implies $p(0, 3) = P(D = 0)$

$g(1, D) = \max\{1 - D, 0\}$

  $g(1, D) = 0$ iff $D \ge 1$ implies $p(1, 0) = P(D \ge 1)$
  $g(1, D) = 1$ iff $D = 0$ implies $p(1, 1) = P(D = 0)$
  $g(1, D) = 2, 3$ is impossible

$g(2, D) = \max\{2 - D, 0\}$

  $g(2, D) = 0$ iff $D \ge 2$ implies $p(2, 0) = P(D \ge 2)$
  $g(2, D) = 1$ iff $D = 1$ implies $p(2, 1) = P(D = 1)$
  $g(2, D) = 2$ iff $D = 0$ implies $p(2, 2) = P(D = 0)$
  $g(2, D) = 3$ is impossible

$g(3, D) = \max\{3 - D, 0\} = g(0, D)$ so that $p(3, k) = p(0, k)$

The various probabilities for $D$ may be obtained from a table (or may be calculated easily with
cpoisson) to give the transition probability matrix

$$P = \begin{bmatrix} 0.0803 & 0.1839 & 0.3679 & 0.3679 \\ 0.6321 & 0.3679 & 0 & 0 \\ 0.2642 & 0.3679 & 0.3679 & 0 \\ 0.0803 & 0.1839 & 0.3679 & 0.3679 \end{bmatrix} \tag{16.55}$$
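
The hand calculation can be mirrored in a few lines of base MATLAB. This is a minimal sketch, assuming m = 1, M = 3, and Poisson (1) demand as in the numerical example; it computes the Poisson probabilities directly rather than calling cpoisson, and the names mu, pd, and top are illustrative assumptions.

```matlab
% Sketch: one-step matrix for the (m, M) = (1, 3) policy, Poisson(1) demand.
mu = 1;  m = 1;  M = 3;
d  = 0:M;                                 % demand values that matter
pd = exp(-mu) * mu.^d ./ factorial(d);    % P(D = 0), ..., P(D = M)
P  = zeros(M+1, M+1);
for j = 0:M
    if j < m
        top = M;                          % restock to M before demand arrives
    else
        top = j;                          % no reorder; start from current stock
    end
    for k = 1:top
        P(j+1, k+1) = pd(top - k + 1);    % stock k remains iff D = top - k
    end
    P(j+1, 1) = 1 - sum(pd(1:top));       % stock 0 remains iff D >= top
end
disp(P)
```

Each row is built from the same case analysis as above: the starting stock is either M (after a reorder) or j, and the demand is subtracted with a floor at zero.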



The calculations are carried out "by hand" in this case, to exhibit the nature of the calculations.
This is a standard problem in inventory theory, involving costs and rewards. An m-procedure
inventory1 has been written to implement the function g.

% file inventory1.m
% Version of 1/27/97
% Data for transition probability calculations
% for (m,M) inventory policy
M  = input('Enter value M of maximum stock  ');
m  = input('Enter value m of reorder point  ');
Y  = input('Enter row vector of demand values  ');
PY = input('Enter demand probabilities  ');
states = 0:M;
ms = length(states);
my = length(Y);
% Calculations for determining P
[y,s] = meshgrid(Y,states);
T  = max(0,M-y).*(s < m) + max(0,s-y).*(s >= m);
P  = zeros(ms,ms);
for i = 1:ms
   [a,b] = meshgrid(T(i,:),states);
   P(i,:) = PY*(a==b)';
end
P



We consider the case M = 5, the reorder point m = 3, and demand is Poisson (3). We approximate
the Poisson distribution with values up to 20.

inventory1
Enter value M of maximum stock  5             % Maximum stock
Enter value m of reorder point  3             % Reorder point
Enter row vector of demand values  0:20       % Truncated set of demand values
Enter demand probabilities  ipoisson(3,0:20)  % Demand probabilities
P =
    0.1847    0.1680    0.2240    0.2240    0.1494    0.0498
    0.1847    0.1680    0.2240    0.2240    0.1494    0.0498
    0.1847    0.1680    0.2240    0.2240    0.1494    0.0498
    0.5768    0.2240    0.1494    0.0498         0         0
    0.3528    0.2240    0.2240    0.1494    0.0498         0
    0.1847    0.1680    0.2240    0.2240    0.1494    0.0498

Example 16.10: Remaining lifetime (continued)

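$g(0, Y) = Y - 1$, so that $p(0, k) = P(Y - 1 = k) = P(Y = k + 1)$
$g(j, Y) = j - 1$ for $j \ge 1$, so that $p(j, k) = \delta_{j-1,k}$ for $j \ge 1$
The resulting transition probability matrix is

$$P = \begin{bmatrix} p_1 & p_2 & p_3 & \cdots \\ 1 & 0 & 0 & \cdots \\ 0 & 1 & 0 & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{bmatrix} \tag{16.56}$$

The matrix is an infinite matrix, unless $Y$ is simple. If the range of $Y$ is $\{1, 2, \cdots, M\}$ then the
state space $E$ is $\{0, 1, \cdots, M - 1\}$.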
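
For a simple $Y$ the matrix is finite and easy to write down. The following is a minimal sketch; the distribution pY is a hypothetical choice made only for illustration.

```matlab
% Sketch: remaining-lifetime transition matrix when Y has range 1:M.
pY = [0.2 0.5 0.3];          % hypothetical P(Y = 1), P(Y = 2), P(Y = 3)
M  = length(pY);             % states are 0, 1, ..., M-1
P  = zeros(M, M);
P(1, :) = pY;                % p(0, k) = P(Y = k + 1): a new unit is installed
for j = 1:M-1
    P(j+1, j) = 1;           % p(j, j-1) = 1: remaining life counts down by one
end
disp(P)
```

The first row carries the lifetime distribution and every other row has a single one just below the diagonal, which is the pattern of (16.56).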

Various properties of conditional independence, particularly (CI9) ("(CI9)", p. 602), (CI10) ("(CI10)", p.
603), and (CI12) ("(CI12)", p. 603), may be used to establish the following. The immediate future $X_{n+1}$ may
be replaced by any finite future $U_{n,n+p}$ and the present $X_n$ may be replaced by any extended present $U_{m,n}$.
Some results of abstract measure theory show that the finite future $U_{n,n+p}$ may be replaced by the entire
future $U^n$. Thus, we may assert

Extended Markov property

$X_N$ is Markov iff

$$(M^*) \quad \{U^n, U_m\} \ \text{ci} \ |U_{m,n} \quad \forall \ 0 \le m \le n \tag{16.57}$$

- □

The Chapman-Kolmogorov equation and the transition matrix

As a special case of the extended Markov property, we have

$$\{U^{n+k}, U_n\} \ \text{ci} \ |X_{n+k} \quad \text{for all } n \ge 0, \ k \ge 1 \tag{16.58}$$

Setting $g(U^{n+k}, X_{n+k}) = X_{n+k+m}$ and $h(U_n, X_{n+k}) = X_n$ in (CI9) ("(CI9)", p. 602), we get

$$\{X_{n+k+m}, X_n\} \ \text{ci} \ |X_{n+k} \quad \text{for all } n \ge 0, \ k, m \ge 1 \tag{16.59}$$

By the iterated conditioning rule (CI9) ("(CI8) ", p. 602) for conditional independence, it follows that

$$(CK) \quad E[g(X_{n+k+m}) | X_n] = E\{E[g(X_{n+k+m}) | X_{n+k}] | X_n\} \quad \forall \ n \ge 0, \ k, m \ge 1 \tag{16.60}$$



This is the Chapman-Kolmogorov equation, which plays a central role in the study of Markov sequences.
For a discrete state space $E$, with

$$P(X_n = j | X_m = i) = p_{m,n}(i, j) \tag{16.61}$$

this equation takes the form

$$(CK') \quad p_{m,q}(i, k) = \sum_{j \in E} p_{m,n}(i, j) \, p_{n,q}(j, k) \quad 0 \le m < n < q \tag{16.62}$$

To see that this is so, consider

$$P(X_q = k | X_m = i) = E[I_{\{k\}}(X_q) | X_m = i] = E\{E[I_{\{k\}}(X_q) | X_n] | X_m = i\} \tag{16.63}$$

$$= \sum_j E[I_{\{k\}}(X_q) | X_n = j] \, p_{m,n}(i, j) = \sum_j p_{n,q}(j, k) \, p_{m,n}(i, j) \tag{16.64}$$

Homogeneous case 

For this case, we may put (CK') in a useful matrix form. The conditional probabilities $p_m$ of the form

$$p_m(i, k) = P(X_{n+m} = k | X_n = i) \quad \text{invariant in } n \tag{16.65}$$

are known as the m-step transition probabilities. The Chapman-Kolmogorov equation in this case becomes

$$(CK'') \quad p_{m+n}(i, k) = \sum_{j \in E} p_m(i, j) \, p_n(j, k) \quad \forall \ i, k \in E \tag{16.66}$$

In terms of the m-step transition matrix $P^{(m)} = [p_m(i, k)]$, this set of sums is equivalent to the matrix
product

$$(CK'') \quad P^{(m+n)} = P^{(m)} P^{(n)} \tag{16.67}$$

Now

$$P^{(2)} = P^{(1)} P^{(1)} = PP = P^2, \quad P^{(3)} = P^{(2)} P^{(1)} = P^3, \quad \text{etc.} \tag{16.68}$$

A simple inductive argument based on (CK'') establishes

The product rule for transition matrices

The m-step probability matrix $P^{(m)} = P^m$, the mth power of the transition matrix $P$.

— □ 

Example 16.11: The inventory problem (continued) 

For the inventory problem in Example 16.9 (Inventory problem (continued)), the three-step transition
probability matrix $P^{(3)}$ is obtained by raising $P$ to the third power to get

$$P^{(3)} = P^3 = \begin{bmatrix} 0.2930 & 0.2917 & 0.2629 & 0.1524 \\ 0.2619 & 0.2730 & 0.2753 & 0.1898 \\ 0.2993 & 0.2854 & 0.2504 & 0.1649 \\ 0.2930 & 0.2917 & 0.2629 & 0.1524 \end{bmatrix} \tag{16.69}$$

□
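
The product rule is easy to check numerically. The sketch below uses the inventory matrix (16.55), rounded to four decimals as printed, so results agree with (16.69) only to about that precision; P2, P3, and ckgap are illustrative names.

```matlab
% Sketch: m-step matrices as powers of P, and a Chapman-Kolmogorov check.
P = [0.0803 0.1839 0.3679 0.3679
     0.6321 0.3679 0      0
     0.2642 0.3679 0.3679 0
     0.0803 0.1839 0.3679 0.3679];
P2 = P*P;                            % two-step transition matrix
P3 = P^3;                            % three-step transition matrix
ckgap = max(max(abs(P3 - P2*P)));    % P^(2+1) should equal P^(2) P^(1)
disp(P3)
```

The quantity ckgap should be zero up to floating-point rounding, which is the matrix form (CK'') with m = 2 and n = 1.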



We consider next the state probabilities for the various stages. That is, we examine the distributions for the
various $X_n$, letting $p_k(n) = P(X_n = k)$ for each $k \in E$. To simplify writing, we consider a finite state space
$E = \{1, \cdots, M\}$. We use $\pi(n)$ for the row matrix

$$\pi(n) = [p_1(n) \ p_2(n) \ \cdots \ p_M(n)] \tag{16.70}$$



As a consequence of the product rule, we have 

Probability distributions for any period 

For a homogeneous Markov sequence, the distribution for any $X_n$ is determined by the initial distribution
(i.e., for $X_0$) and the transition probability matrix $P$.

VERIFICATION

Suppose the homogeneous sequence $X_N$ has finite state-space $E = \{1, 2, \cdots, M\}$. For any $n \ge 0$, let
$p_j(n) = P(X_n = j)$ for each $j \in E$. Put

$$\pi(n) = [p_1(n) \ p_2(n) \ \cdots \ p_M(n)] \tag{16.71}$$

Then

$\pi(0)$ = the initial probability distribution
$\pi(1) = \pi(0) P$
$\cdots$
$\pi(n) = \pi(n-1) P = \pi(0) P^{(n)} = \pi(0) P^n$ = the nth-period distribution

The last expression is an immediate consequence of the product rule.

— □ 

Example 16.12: Inventory problem (continued) 

In the inventory system for Examples 3 (Example 16.5: An inventory problem), 7 (Example 16.9:
Inventory problem (continued)) and 9 (Example 16.11: The inventory problem (continued)), suppose
the initial stock is M = 3. This means that

$$\pi(0) = [0 \ 0 \ 0 \ 1] \tag{16.72}$$

The product of $\pi(0)$ and $P^3$ is the fourth row of $P^3$, so that the distribution for $X_3$ is

$$\pi(3) = [p_0(3) \ p_1(3) \ p_2(3) \ p_3(3)] = [0.2930 \ 0.2917 \ 0.2629 \ 0.1524] \tag{16.73}$$

Thus, given a stock of M = 3 at startup, the probability is 0.2917 that $X_3 = 1$. This is the
probability of one unit in stock at the end of period number three.
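
The recursion $\pi(n) = \pi(n-1) P$ may also be carried out stage by stage. This is a small sketch under the same assumptions as the example (initial stock 3, matrix rounded to four decimals); pi0, pin, and pi3 are illustrative names.

```matlab
% Sketch: state distribution after three periods, two equivalent ways.
P = [0.0803 0.1839 0.3679 0.3679
     0.6321 0.3679 0      0
     0.2642 0.3679 0.3679 0
     0.0803 0.1839 0.3679 0.3679];
pi0 = [0 0 0 1];             % initial stock is 3 (state number 4)
pin = pi0;
for n = 1:3
    pin = pin*P;             % pi(n) = pi(n-1) P
end
pi3 = pi0*P^3;               % pi(3) = pi(0) P^3, by the product rule
disp([pin; pi3])
```

Both rows of the display should agree, and each should match the fourth row of $P^3$ to the precision of the rounded input.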

Remarks 

• A similar treatment shows that for the nonhomogeneous case the distribution at any stage is determined
by the initial distribution and the class of one-step transition matrices. In the nonhomogeneous case,
transition probabilities $p_{n,n+1}(i, j)$ depend on the stage $n$.

• A discrete-parameter Markov process, or Markov sequence, is characterized by the fact that each
member $X_{n+1}$ of the sequence is conditioned by the value of the previous member of the sequence.
This one-step stochastic linkage has made it customary to refer to a Markov sequence as a Markov chain.
In the discrete-parameter Markov case, we use the terms process, sequence, or chain interchangeably.

The transition diagram and the transition matrix 

The previous examples suggest that a Markov chain is a dynamic system, evolving in time. On the 
other hand, the stochastic behavior of a homogeneous chain is determined completely by the probability 
distribution for the initial state and the one-step transition probabilities p (i, j) as presented in the transition 
matrix P. The time-invariant transition matrix may convey a static impression of the system. However, a 
simple geometric representation, known as the transition diagram, makes it possible to link the unchanging 
structure, represented by the transition matrix, with the dynamics of the evolving system behavior. 

Definition. A transition diagram for a homogeneous Markov chain is a linear graph with one node for 
each state and one directed edge for each possible one-step transition between states (nodes). 

We ignore, as essentially impossible, any transition which has zero transition probability. Thus, the edges
on the diagram correspond to positive one-step transition probabilities between the nodes connected. Since
for some pair $(i, j)$ of states, we may have $p(i, j) > 0$ but $p(j, i) = 0$, we may have a connecting edge between
two nodes in one direction, but none in the other. The system can be viewed as an object jumping from
state to state (node to node) at the successive transition times. As we follow the trajectory of this object,
we achieve a sense of the evolution of the system.

Example 16.13: Transition diagram for inventory example

Consider, again, the transition matrix P for the inventory problem (rounded to three decimals).

$$P = \begin{bmatrix} 0.080 & 0.184 & 0.368 & 0.368 \\ 0.632 & 0.368 & 0 & 0 \\ 0.264 & 0.368 & 0.368 & 0 \\ 0.080 & 0.184 & 0.368 & 0.368 \end{bmatrix} \tag{16.74}$$



Figure 16.4 shows the transition diagram for this system. At each node corresponding to one of the
possible states, the state value is shown. In this example, the state value is one less than the state
number. For convenience, we refer to the node for state k + 1, which has state value k, as node k. If
the state value is zero, there are four possibilities: remain in that condition with probability 0.080;
move to node 1 with probability 0.184; move to node 2 with probability 0.368; or move to node 3
with probability 0.368. These are represented by the "self loop" and a directed edge to each of the
nodes representing states. Each of these directed edges is marked with the (conditional) transition
probability. On the other hand, probabilities of reaching state value 0 from each of the others are
represented by directed edges into the node for state value 0. A similar situation holds for each
other node. Note that the probabilities on edges leaving a node (including a self loop) must total
to one, since these correspond to the transition probability distribution from that node. There is
no directed edge from node 2 to node 3, since the probability of a transition from value 2 to
value 3 is zero. Similarly, there is no directed edge from node 1 to either node 2 or node 3.




Figure 16.4: Transition diagram for the inventory system of Example 16.13 (Transition diagram for
inventory example).

There is a one-one relation between the transition diagram and the transition matrix P. The transition
diagram not only aids in visualizing the dynamic evolution of a chain, but also displays certain structural
properties. Often a chain may be decomposed usefully into subchains. Questions of communication and
recurrence may be answered in terms of the transition diagram. Some subsets of states are essentially
closed, in the sense that if the system arrives at any one state in the subset it can never reach a state outside
the subset. Periodicities can sometimes be seen, although it is usually easier to use the diagram to show
that periodicities cannot occur.

Classification of states 

Many important characteristics of a Markov chain can be studied by considering the number of visits to 
an arbitrarily chosen, but fixed, state. 

Definition. For a fixed state $j$, let

$T_1$ = the time (stage number) of the first visit to state $j$ (after the initial period).

$F_k(i, j) = P(T_1 = k | X_0 = i)$, the probability of reaching state $j$ for the first time from state $i$ in $k$
steps.

$F(i, j) = P(T_1 < \infty | X_0 = i) = \sum_{k=1}^{\infty} F_k(i, j)$, the probability of ever reaching state $j$ from state $i$.

A number of important theorems may be developed for $F_k$ and $F$, although we do not develop them in this
treatment. We simply quote them as needed. An important classification of states is made in terms of $F$.

Definition. State j is said to be transient iff F (j, j) < 1, 

and is said to be recurrent iff F (j, j) = 1. 

Remark. If the state space E is infinite, recurrent states fall into one of two subclasses: positive or null. 
Only the positive case is common in the infinite case, and that is the only possible case for systems with 
finite state space. 

Sometimes there is a regularity in the structure of a Markov sequence that results in periodicities. 

Definition. For state $j$, let

$$\delta = \text{greatest common divisor of } \{n : p_n(j, j) > 0\} \tag{16.75}$$

If $\delta > 1$, then state $j$ is periodic with period $\delta$; otherwise, state $j$ is aperiodic.

Usually if there are any self loops in the transition diagram (positive probabilities on the diagonal of the 
transition matrix P) the system is aperiodic. Unless stated otherwise, we limit consideration to the aperiodic 
case. 
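
For a small chain, the period in the definition above can be explored numerically. The sketch below is an approximation only: it takes the greatest common divisor of the return times found within an assumed horizon nmax, which in general is a multiple of the true period. The cyclic matrix P and the names nmax, d, and Pn are hypothetical.

```matlab
% Sketch: estimate the period of each state from the first nmax powers of P.
P = [0 1 0
     0 0 1
     1 0 0];                  % hypothetical three-state cyclic chain
nmax = 12;                    % assumed search horizon
M  = size(P, 1);
d  = zeros(1, M);             % gcd(0, n) = n, so 0 is a neutral start
Pn = eye(M);
for n = 1:nmax
    Pn = Pn*P;                % Pn now holds the n-step matrix
    for j = 1:M
        if Pn(j,j) > 0        % a return to j in n steps is possible
            d(j) = gcd(d(j), n);
        end
    end
end
disp(d)
```

For this cyclic chain every state can return only at multiples of three steps. A chain with a self loop at state j has p(j, j) > 0, so n = 1 enters the gcd and the state is aperiodic.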

Definition. A state j is called ergodic iff it is positive, recurrent, and aperiodic. 

It is called absorbing iff p(j, j) = 1. 

A recurrent state is one to which the system eventually returns, hence is visited an infinity of times. If 
it is absorbing, then once it is reached it returns each step (i.e., never leaves). 

An arrow notation is used to indicate important relations between states. 

Definition. We say 

State $i$ reaches $j$, denoted $i \to j$, iff $p_n(i, j) > 0$ for some $n > 0$.

States $i$ and $j$ communicate, denoted $i \leftrightarrow j$, iff both $i$ reaches $j$ and $j$ reaches $i$.

By including $j$ reaches $j$ in all cases, the relation $\leftrightarrow$ is an equivalence relation (i.e., is reflexive, symmetric,
and transitive). With this relationship, we can define important classes.

Definition. A class of states is communicating iff every state in the class may be reached from every 
other state in the class (i.e. every pair communicates). A class is closed if no state outside the class can be 
reached from within the class. 
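
The relations "reaches" and "communicates" depend only on which one-step probabilities are positive, so they can be read off from powers of the edge pattern of the transition diagram. The following is a sketch with a hypothetical four-state matrix; A, R, and C are illustrative names, and j is taken to reach j in all cases, as in the convention above.

```matlab
% Sketch: reachability and communication from the pattern of positive entries.
P = [0.5 0.5 0   0
     0.2 0.8 0   0
     0.1 0.2 0.3 0.4
     0   0   0   1  ];        % hypothetical four-state chain
M = size(P, 1);
A = double(P > 0) + eye(M);   % edges of the transition diagram, plus j -> j
R = A^(M-1) > 0;              % R(i,j) is true iff i reaches j
C = R & R';                   % C(i,j) is true iff i and j communicate
disp(C)
```

Here states 1 and 2 form a closed communicating class, state 4 is absorbing, and state 3 reaches every state but communicates only with itself.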

The following important conditions are intuitive and may be established rigorously:

$i \leftrightarrow j$ implies $i$ is recurrent iff $j$ is recurrent

$i \to j$ and $i$ recurrent implies $i \leftrightarrow j$

$i \to j$ and $i$ recurrent implies $j$ recurrent

Limit theorems for finite state space sequences

The following propositions may be established for Markov sequences with finite state space:

• There are no null states, and not all states are transient.

• If a class of states is irreducible (i.e., has no proper closed subsets), then

- All states are recurrent

- All states are aperiodic or all are periodic with the same period.

• If a class C is closed, irreducible, and i is a transient state (necessarily not in C),
then $F(i, j) = F(i, k)$ for all $j, k \in C$.

A limit theorem

If the states in a Markov chain are ergodic (i.e., positive, recurrent, aperiodic), then

$$\lim_n p_n(i, j) = \pi_j > 0 \qquad \sum_{j=1}^{M} \pi_j = 1 \qquad \pi_j = \sum_{i=1}^{M} \pi_i \, p(i, j) \tag{16.76}$$

If, as above, we let

$$\pi(n) = [p_1(n) \ p_2(n) \ \cdots \ p_M(n)] \quad \text{so that } \pi(n) = \pi(0) P^n \tag{16.77}$$

the result above may be written

$$\pi(n) = \pi(0) P^n \to \pi(0) P_0 \tag{16.78}$$

where

$$P_0 = \begin{bmatrix} \pi_1 & \pi_2 & \cdots & \pi_M \\ \pi_1 & \pi_2 & \cdots & \pi_M \\ \cdots & \cdots & \cdots & \cdots \\ \pi_1 & \pi_2 & \cdots & \pi_M \end{bmatrix} \tag{16.79}$$

Each row of $P_0 = \lim_n P^n$ is the long run distribution $\pi = \lim_n \pi(n)$.

Definition. A distribution is stationary iff

$$\pi = \pi P \tag{16.80}$$

The result above may be stated by saying that the long-run distribution is the stationary distribution. A
generating function analysis shows the convergence is exponential in the following sense

$$|P^n - P_0| \le \alpha |\lambda|^n \tag{16.81}$$

where $|\lambda|$ is the largest absolute value of the eigenvalues for $P$ other than $\lambda = 1$.

Example 16.14: The long run distribution for the inventory example



We use MATLAB to check the eigenvalues for the transition probability P and to obtain increasing
powers of P. The convergence process is readily evident.

P =
    0.0803    0.1839    0.3679    0.3679
    0.6321    0.3679         0         0
    0.2642    0.3679    0.3679         0
    0.0803    0.1839    0.3679    0.3679
E = abs(eig(P))
E =
    1.0000
    0.2602
    0.2602
    0.0000
format long
N = E(2).^[4 8 12]
N = 0.00458242348096   0.00002099860496   0.00000009622450
>> P4 = P^4
P4 =
   0.28958568915950   0.28593792666752   0.26059678211310   0.16387960205989
   0.28156644866011   0.28479107531968   0.26746979455342   0.16617268146679
   0.28385952806702   0.28250048636032   0.26288737107246   0.17075261450021
   0.28958568915950   0.28593792666752   0.26059678211310   0.16387960205989
>> P8 = P^8
P8 =
   0.28580046500309   0.28471421248816   0.26315895715219   0.16632636535655
   0.28577030590344   0.28469190218618   0.26316681807503   0.16637097383535
   0.28581491438224   0.28471028095839   0.26314057837998   0.16633422627939
   0.28580046500309   0.28471421248816   0.26315895715219   0.16632636535655
>> P12 = P^12
P12 =
   0.28579560683438   0.28470680858266   0.26315641543927   0.16634116914369
   0.28579574073314   0.28470680714781   0.26315628010643   0.16634117201261
   0.28579574360207   0.28470687626748   0.26315634631961   0.16634103381085
   0.28579560683438   0.28470680858266   0.26315641543927   0.16634116914369
>> error4 = max(max(abs(P^16 - P4)))     % Use P^16 for P_0
error4 = 0.00441148012334                % Compare with 0.0045824...
>> error8 = max(max(abs(P^16 - P8)))
error8 = 2.984007206519035e-05           % Compare with 0.00002099
>> error12 = max(max(abs(P^16 - P12)))
error12 = 1.005660185959822e-07          % Compare with 0.00000009622450

The convergence process is clear, and the observed errors are close to the predicted values. We
have not determined the factor $\alpha$, and we have approximated the long run matrix $P_0$ with $P^{16}$. This
exhibits a practical criterion for sufficient convergence. If the rows of $P^n$ agree within acceptable
precision, then n is sufficiently large. For example, if we consider agreement to four decimal places
sufficient, then

P10 = P^10
P10 =
    0.2858    0.2847    0.2632    0.1663
    0.2858    0.2847    0.2632    0.1663
    0.2858    0.2847    0.2632    0.1663
    0.2858    0.2847    0.2632    0.1663

shows that n = 10 is quite sufficient.
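
As a cross-check on the powers of P, the stationary distribution can also be obtained by solving $\pi = \pi P$ together with the condition that the probabilities sum to one. This is a sketch of that alternative, not the procedure used in the text; it uses the matrix rounded to four decimals, and A, b, and pistat are illustrative names.

```matlab
% Sketch: stationary distribution from the linear equations pi = pi*P, sum(pi) = 1.
P = [0.0803 0.1839 0.3679 0.3679
     0.6321 0.3679 0      0
     0.2642 0.3679 0.3679 0
     0.0803 0.1839 0.3679 0.3679];
M = size(P, 1);
A = [P' - eye(M); ones(1, M)];   % pi*(P - I) = 0 stacked with sum(pi) = 1
b = [zeros(M, 1); 1];
pistat = (A \ b)';               % least-squares solution of the stacked system
disp(pistat)
```

The stacked system has one more equation than unknowns but is consistent, so the backslash operator returns the solution; it should agree with any row of P10 to the displayed precision.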
16.2.2 Simulation of finite homogeneous Markov sequences

In the section, "The Quantile Function" (Section 10.3), the quantile function is used with a random number
generator to obtain a simple random sample from a given population distribution. In this section, we adapt
that procedure to the problem of simulating a trajectory for a homogeneous Markov sequence with finite
state space.

Elements and terminology 

1. States and state numbers. We suppose there are m states, usually carrying a numerical value. For 
purposes of analysis and simulation, we number the states 1 through m. Computation is carried out 
with state numbers; if desired, these can be translated into the actual state values after computation 
is completed. 

2. Stages, transitions, period numbers, trajectories and time. We use the terms stage and period
interchangeably. It is customary to number the periods or stages beginning with zero for the initial
stage. The period number is the number of transitions to reach that stage from the initial one. Zero 
transitions are required to reach the original stage (period zero), one transition to reach the next 
(period one), two transitions to reach period two, etc. We call the sequence of states encountered as 
the system evolves a trajectory or a chain. The terms "sample path" or "realization of the process" are 
also used in the literature. Now if the periods are of equal time length, the number of transitions is a 
measure of the elapsed time since the chain originated. We find it convenient to refer to time in this 
fashion. At time k the chain has reached the period numbered k. The trajectory is k + 1 stages long, 
so time or period number is one less than the number of stages. 

3. The transition matrix and the transition distributions. For each state, there is a conditional 
transition probability distribution for the next state. These are arranged in a transition matrix. The 
ith row consists of the transition distribution for selecting the next-period state when the current state 
number is i. The transition matrix P thus has nonnegative elements, with each row summing to one. 
Such a matrix is known as a stochastic matrix. 

The fundamental simulation strategy 

1. A fundamental strategy for sampling from a given population distribution is developed in the unit on 
the Quantile Function. If Q is the quantile function for the population distribution and U is a random 
variable distributed uniformly on the interval [0,1], then X = Q (U) has the desired distribution. 
To obtain a sample from the uniform distribution use a random number generator. This sample is 
"transformed" by the quantile function into a sample from the desired distribution. 

2. For a homogeneous chain, if we are in state k, we have a distribution for selecting the next state. If we 
use the quantile function for that distribution and a number produced by a random number generator, 
we make a selection of the next state based on that distribution. A succession of these choices, with 
the selection of the next state made in each case from the distribution for the current state, constitutes 
a valid simulation of a trajectory. 
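
The strategy can be sketched in a few lines of base MATLAB. This is not the author's mchain procedure; it is a minimal illustration that uses the inventory matrix rounded to four decimals, an assumed trajectory length n, and an assumed initial state number s. The names F and tr are illustrative.

```matlab
% Sketch: simulate a trajectory by applying the quantile function of the
% current row of P to a uniform random number at each step.
P = [0.0803 0.1839 0.3679 0.3679
     0.6321 0.3679 0      0
     0.2642 0.3679 0.3679 0
     0.0803 0.1839 0.3679 0.3679];
n  = 20;                         % number of transitions (assumed)
s  = 4;                          % initial state number (assumed)
F  = cumsum(P, 2);               % row i: distribution function for the next state
F(:, end) = 1;                   % guard against rounding in the last column
tr = zeros(1, n+1);              % trajectory of state numbers, n+1 stages
tr(1) = s;
for t = 1:n
    u = rand;                    % sample from the uniform distribution
    s = find(u <= F(s, :), 1);   % smallest state number k with F(s,k) >= u
    tr(t+1) = s;
end
disp(tr)
```

State numbers run from 1 to 4; subtracting one gives the state values (stock levels) of the inventory example.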

Arrival times and recurrence times 

The basic simulation produces one or more trajectories of a specified length. Sometimes we are interested 
in continuing until first arrival at (or visit to) a specific target state or any one of a set of target states. The 
time (in transitions) to reach a target state is one less than the number of stages in the trajectory which 
begins with the initial state and ends with the target state reached. 



• If the initial state is not in the target set, we speak of the arrival time.

• If the initial state is in the target set, the arrival time would be zero. In this case, we do not stop at
zero but continue until the next visit to a target state (possibly the same as the initial state). We call
the number of transitions in this case the recurrence time.

• In some instances, it may be desirable to know the time to complete visits to a prescribed number of
the target states. Again there is a choice of treatment in the case the initial state is in the target set.
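
A single arrival or recurrence time can be simulated with the same quantile step. The sketch below is an illustration under assumptions, not the author's arrival procedure: the target set and initial state are hypothetical, and the loop always makes at least one transition, so the count T is an arrival time when the initial state lies outside the target set and a recurrence time when it lies inside.

```matlab
% Sketch: number of transitions until the chain first enters a target set.
P = [0.0803 0.1839 0.3679 0.3679
     0.6321 0.3679 0      0
     0.2642 0.3679 0.3679 0
     0.0803 0.1839 0.3679 0.3679];
F = cumsum(P, 2);
F(:, end) = 1;                       % guard against rounding in the last column
target = [1 2];                      % hypothetical target set (state numbers)
s = 4;                               % initial state number (assumed)
T = 0;                               % transitions made so far
while true
    s = find(rand <= F(s, :), 1);    % one transition by the quantile method
    T = T + 1;
    if any(s == target)
        break                        % stop at the first visit to the target set
    end
end
disp([T s])
```

Repeating the loop many times and tabulating T gives an empirical distribution of arrival times.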
Data files

For use of MATLAB in simulation, we find it convenient to organize the appropriate data in an m-file. 

• In every case, we need the transition matrix P. Its size indicates the number of states (say by the 
length of any row or column) . 

• If the states are to have values other than the state numbers, these may be included in the data file, 
although they may be added later, in response to a prompt. 

• If long trajectories are to be produced, it may be desirable to determine the fraction of times each state 
is realized. A comparison with the long-run probabilities for the chain may be of interest. In this case, 
the data file may contain the long-run probability distribution. Usually, this is obtained by taking one 
row of a sufficiently large power of the transition matrix. This operation may be performed after the 
data file is called for but before the simulation procedure begins. 

An example data file used to illustrate the various procedures is shown below. These data were generated
artificially and have no obvious interpretation in terms of a specific system to be modeled. However, they
are sufficiently complex to provide nontrivial illustrations of the simulation procedures.

% file markovp1.m
% Artificial data for a Markov chain, used to
% illustrate the operation of the simulation procedures.
P = [0.050 0.011 0.155 0.155 0.213 0.087 0.119 0.190 0.008 0.012
     0.103 0.131 0.002 0.075 0.013 0.081 0.134 0.115 0.181 0.165
     0.103 0.018 0.128 0.081 0.137 0.180 0.149 0.051 0.009 0.144
     0.051 0.098 0.118 0.154 0.057 0.039 0.153 0.112 0.117 0.101
     0.016 0.143 0.200 0.062 0.099 0.175 0.108 0.054 0.062 0.081
     0.029 0.085 0.156 0.158 0.011 0.156 0.088 0.090 0.055 0.172
     0.110 0.059 0.020 0.212 0.016 0.113 0.086 0.062 0.204 0.118
     0.084 0.171 0.009 0.138 0.140 0.150 0.023 0.003 0.125 0.157
     0.105 0.123 0.121 0.167 0.149 0.040 0.051 0.059 0.086 0.099
     0.192 0.093 0.191 0.061 0.094 0.123 0.106 0.065 0.040 0.035];
states = 10:3:37;
PI = [0.0849 0.0905 0.1125 0.1268 0.0883 0.1141 ...
      0.1049 0.0806 0.0881 0.1093];     % Long-run distribution

The largest absolute value of the eigenvalues (other than one) is 0.1716. Since $0.1716^{16} \approx 5.6 \cdot 10^{-13}$, we take
any row of $P^{16}$ as the long-run probabilities. These are included in the matrix PI in the m-file, above. The
examples for the various procedures below use this set of artificial data, since the purpose is to illustrate the
operation of the procedures.

The setup and the generating m-procedures 

The m-procedure chainset sets up for simulation of Markov chains. It prompts for input of the transition 
matrix P, the states (if different from the state numbers) , the long-run distribution (if available) , and the set 
of target states if it is desired to obtain arrival or recurrence times. The procedure determines the number 
of states from the size of P and calculates the information needed for the quantile function. It then prompts 
for a call for one of the generating procedures. 

The m-procedure mchain, as do the other generating procedures below, assumes chainset has been run,
so that commonly used data are available in appropriate form. The procedure prompts for the number of
stages (length of the trajectory to be formed) and for the initial state. When the trajectory is produced, the
various states in the trajectory and the fraction or relative frequency of each are displayed. If the long-run
distribution has been supplied by chainset, this distribution is included for comparison. In the examples
below, we reset the random number generator (set the "seed" to zero) for purposes of comparison. However,
in practice, it may be desirable to make several runs without resetting the seed, to allow greater effective
"randomness."

Example 16.15

markovp1                                     % Call for data
chainset                                     % Call for setup procedure
Enter the transition matrix  P
Enter the states if not 1:ms  states         % Enter the states
States are
     1    10
     2    13
     3    16
     4    19
     5    22
     6    25
     7    28
     8    31
     9    34
    10    37
Enter the long-run probabilities  PI         % Enter the long-run distribution
Enter the set of target states  [16 22 25]   % Not used with mchain
Call for for appropriate chain generating procedure
rand('seed',0)
mchain                                       % Call for generating procedure
Enter the number n of stages   10000         % Note the trajectory length
Enter the initial state  16
                                             % Statistics on the trajectory
     State      Frac        P0
   10.0000    0.0812    0.0849
   13.0000    0.0952    0.0905
   16.0000    0.1106    0.1125
   19.0000    0.1226    0.1268
   22.0000    0.0880    0.0883
   25.0000    0.1180    0.1141
   28.0000    0.1034    0.1049
   31.0000    0.0814    0.0806
   34.0000    0.0849    0.0881
   37.0000    0.1147    0.1093

To view the first part of the trajectory of states, call for TR
disp(TR')
     0     1     2     3     4     5     6     7     8     9    10
    16    16    10    28    34    37    16    25    37    10    13

The fact that the fractions or relative frequencies approximate the long-run probabilities is an 
expression of a fundamental limit property of probability theory. This limit property, which requires
somewhat sophisticated technique to establish, justifies a relative frequency interpretation
of probability. 

The procedure arrival assumes the setup provided by chainset, including a set E of target states. 
The procedure prompts for the number r of repetitions and the initial state. Then it produces r 
successive trajectories, each starting with the prescribed initial state and ending on one of the target
states. The arrival times vary from one run to the next. Various statistics are computed and 
displayed or made available. In the single-run case (r = 1), the trajectory may be displayed. An 
auxiliary procedure plotdbn may be used in the multirun case to plot the distribution of arrival 
times. 
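
A minimal sketch of the repeated-run idea follows. It is not the text's arrival procedure: the matrix P, the target set E, the initial state s0, and the repetition count r are hypothetical values chosen for illustration. In this particular P both non-target rows assign probability 0.2 to the target state, so the arrival time is geometric with mean 5, which gives a simple check on the simulation.

```matlab
% Sketch of repeated arrival-time simulation; P, E, s0, r are hypothetical.
P = [0.5 0.3 0.2
     0.2 0.6 0.2
     0.3 0.3 0.4];
E  = 3;                          % target set (a single state here)
s0 = 1;                          % initial state, not in E
r  = 1000;                       % number of repetitions
F  = cumsum(P, 2);
F(:, end) = 1;                   % guard against rounding in the last column
T  = zeros(1, r);                % arrival times
for j = 1:r
    s = s0;
    t = 0;
    while ~any(s == E)           % stop on the first visit to the target set
        s = find(rand <= F(s, :), 1);
        t = t + 1;
    end
    T(j) = t;
end
fprintf('average %.3f, std %.3f, min %d, max %d\n', mean(T), std(T), min(T), max(T))
```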
Example 16.16: Arrival time to a target set of states

rand('seed',0)
arrival                                  % Assumes chainset has been run, as above
Enter the number of repetitions  1       % Single run case
The target state set is:
    16    22    25
Enter the initial state  34              % Specified initial state
The arrival time is 6                    % Data on trajectory
The state reached is 16
To view the trajectory of states, call for TR
disp(TR')                                % Optional call to view trajectory
     0     1     2     3     4     5     6
    34    13    10    28    34    37    16
rand('seed',0)
arrival
Enter the number of repetitions  1000    % Call for 1000 repetitions
The target state set is:
    16    22    25
Enter the initial state  34              % Specified initial state
The result of 1000 repetitions           % Run data (see optional calls below)
Term state  Rel Freq   Av time
   16.0000    0.3310    3.3021
   22.0000    0.3840    3.2448
   25.0000    0.2850    4.3895
The average arrival time is 3.59
The standard deviation is 3.207
The minimum arrival time is 1
The maximum arrival time is 23
To view the distribution of arrival times, call for dbn
To plot the arrival time distribution, call for plotdbn
plotdbn                                  % See Figure 16.5

Figure 16.5: Time distribution for Example 16.16 (Arrival time to a target set of states). The horizontal axis is time in number of transitions.



It would be difficult to establish estimates of arrival times analytically. The simulation procedure
gives a reasonable "feel" for these times and how they vary. 

The procedure recurrence is similar to the procedure arrival. If the initial state is not in the 
target set, it behaves as does the procedure arrival and stops on the first visit to the target set. 
However, if the initial state is in the target set, the procedures are different. The procedure arrival 
stops with zero transitions, since it senses that it has "arrived." We are usually interested in having 
at least one transition, back to the same state or to another state in the target set. We call these
times recurrence times. 

Example 16.17 



rand('seed',0)
recurrence
Enter the number of repititions  1
The target state set is:
    16    22    25
Enter the initial state  22
The recurrence time is 1
The state reached is 16
To view the trajectory of state numbers, call for TR
disp(TR')
     0     1
    22    16
recurrence
Enter the number of repititions  1000
The target state set is:
    16    22    25
Enter the initial state  25
The result of 1000 repetitions is:
Term state  Rel Freq   Av time
   16.0000    0.3680    2.8723
   22.0000    0.2120    4.6745
   25.0000    0.4200    3.1690
The average recurrence time is 3.379
The standard deviation is 3.0902
The minimum recurrence time is 1
The maximum recurrence time is 20
To view the distribution of recurrence times, call for dbn
To plot the recurrence time distribution, call for plotdbn
                                         % See Figure 16.6

Figure 16.6: Transition time distribution for Example 16.17. The horizontal axis is time in number of transitions.

The procedure kvis stops when a designated number k of target states have been visited. If k is greater than the number
of target states, or if no k is designated, the procedure stops when all have been visited. For k = 1, the 
behavior is the same as arrival. However, that case is better handled by the procedure arrival, which provides 
more statistics on the results. 



Example 16.18 
rand('seed',0)
kvis                                     % Assumes chainset has been run
Enter the number of repetitions  1
The target state set is:
    16    22    25
Enter the number of target states to visit  2
Enter the initial state  34
The time for completion is 7
To view the trajectory of states, call for TR
disp(TR')
     0     1     2     3     4     5     6     7
    34    13    10    28    34    37    16    25
rand('seed',0)
kvis
Enter the number of repetitions  100
The target state set is:
    16    22    25
Enter the number of target states to visit     % Default-- visit all three
Enter the initial state  31
The average completion time is 17.57
The standard deviation is 8.783
The minimum completion time is 5
The maximum completion time is 42
To view a detailed count, call for D.
The first column shows the various completion times;
the second column shows the numbers of trials yielding those times

The first goal of this somewhat sketchy introduction to Markov processes is to provide a general setting which 
gives insight into the essential character and structure of such systems. The important case of homogeneous
chains is introduced in such a way that their algebraic structure appears as a logical consequence of the
Markov property. The general theory is used to obtain some tools for formulating homogeneous chains in
practical cases. Some MATLAB tools for studying their behavior are applied to an artificial example, which 
demonstrates their general usefulness in studying many practical, applied problems. 

16.3 Problems on Conditional Independence, Given a Random 
Vector 3 

Exercise 16.1 (Solution on p. 527.) 

The pair $\{X, Y\}$ ci $|H$. $X \sim$ exponential $(u/3)$, given $H = u$; $Y \sim$ exponential $(u/5)$, given
$H = u$; and $H \sim$ uniform $[1, 2]$. Determine a general formula for $P(X > r, Y > s)$, then evaluate
for $r = 3$, $s = 10$.

Exercise 16.2 (Solution on p. 527.) 

A small random sample of size n = 12 is taken to determine the proportion of the student body 
which favors a proposal to expand the student Honor Council by adding two additional members 
"at large." Prior information indicates that this proportion is about 0.6 = 3/5. From a Bayesian 
point of view, the population proportion is taken to be the value of a random variable H. It seems 
reasonable to assume a prior distribution $H \sim$ beta (4, 3), giving a maximum of the density at
$(4 - 1)/(4 + 3 - 2) = 3/5$. Seven of the twelve interviewed favor the proposition. What is the best
mean-square estimate of the proportion, given this result? What is the conditional distribution of
$H$, given this result?

3 This content is available online at <http://cnx.org/content/m24604/1.4/>.

Exercise 16.3 (Solution on p. 527.) 

Let $\{X_i : 1 \le i \le n\}$ be a random sample, given $H$. Set $W = (X_1, X_2, \cdots, X_n)$. Suppose $X$ is
conditionally geometric $(u)$, given $H = u$; i.e., suppose $P(X = k|H = u) = u(1 - u)^k$ for all $k \ge 0$.
If $H \sim$ uniform on $[0, 1]$, determine the best mean square estimator for $H$, given $W$.

Exercise 16.4 (Solution on p. 527.) 

Let $\{X_i : 1 \le i \le n\}$ be a random sample, given $H$. Set $W = (X_1, X_2, \cdots, X_n)$. Suppose $X$ is
conditionally Poisson $(u)$, given $H = u$; i.e., suppose $P(X = k|H = u) = e^{-u} u^k / k!$. If $H \sim$ gamma
$(m, \lambda)$, determine the best mean square estimator for $H$, given $W$.

Exercise 16.5 (Solution on p. 527.) 

Suppose $\{N, H\}$ is independent and $\{N, Y\}$ ci $|H$. Use properties of conditional expectation and
conditional independence to show that

$$E[g(N) h(Y)|H] = E[g(N)]\, E[h(Y)|H] \quad \text{a.s.} \qquad (16.82)$$



Exercise 16.6 (Solution on p. 527.) 

Consider the composite demand D introduced in the section on Random Sums (Section 15.1.2: A
useful model - random sums) in "Random Selection"

$$D = \sum_{n=0}^{\infty} I_{\{n\}}(N) X_n \quad \text{where} \quad X_n = \sum_{k=0}^{n} Y_k, \quad Y_0 = 0 \qquad (16.83)$$

Suppose $\{N, H\}$ is independent, $\{N, Y_i\}$ ci $|H$ for all $i$, and $E[Y_i|H] = e(H)$, invariant with $i$.
Show that $E[D|H] = E[N]\, E[Y|H]$ a.s.

Exercise 16.7 (Solution on p. 528.) 

The transition matrix P for a homogeneous Markov chain is as follows (in m-file npr16_07.m
(Section 17.8.44: npr16_07)):

$$P = \begin{bmatrix}
0.23 & 0.32 & 0.02 & 0.22 & 0.21 \\
0.29 & 0.41 & 0.10 & 0.08 & 0.12 \\
0.22 & 0.07 & 0.31 & 0.14 & 0.26 \\
0.32 & 0.15 & 0.05 & 0.33 & 0.15 \\
0.08 & 0.23 & 0.31 & 0.09 & 0.29
\end{bmatrix} \qquad (16.84)$$



a. Obtain the absolute values of the eigenvalues, then consider increasing powers of P to observe 
the convergence to the long run distribution. 

b. Take an arbitrary initial distribution p0 (as a row matrix). The product p0 * P^k is the
distribution for stage k. Note what happens as k becomes large enough to give convergence to
the long run transition matrix. Does the end result change with change of initial distribution
p0?
Exercise 16.8 (Solution on p. 528.)

The transition matrix P for a homogeneous Markov chain is as follows (in m-file npr16_08.m):

$$P = \begin{bmatrix}
0.2 & 0.5 & 0.3 & 0 & 0 & 0 & 0 \\
0.6 & 0.1 & 0.3 & 0 & 0 & 0 & 0 \\
0.2 & 0.7 & 0.1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.6 & 0.4 & 0 & 0 \\
0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 \\
0.1 & 0.3 & 0 & 0.2 & 0.1 & 0.1 & 0.2 \\
0.1 & 0.2 & 0.1 & 0.2 & 0.2 & 0.2 & 0
\end{bmatrix} \qquad (16.85)$$



a. Note that the chain has two subchains, with states {1,2,3} and {4,5}. Draw a transition 
diagram to display the two separate chains. Can any state in one subchain be reached from 
any state in the other? 

b. Check the convergence as in part (a) of Exercise 16.7. What happens to the state probabilities 
for states 6 and 7 in the long run? What does that signify for these states? Can these states 
be reached from any state in either of the subchains? How would you classify these states? 

Exercise 16.9 (Solution on p. 528.) 

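The transition matrix P for a homogeneous Markov chain is as follows (in m-file npr16_09.m
(Section 17.8.45: npr16_09)):

$$P = \begin{bmatrix}
0.1 & 0.2 & 0.1 & 0.3 & 0.2 & 0 & 0.1 \\
0 & 0.6 & 0 & 0 & 0 & 0 & 0.4 \\
0 & 0 & 0.2 & 0.5 & 0 & 0.3 & 0 \\
0 & 0 & 0.6 & 0.1 & 0 & 0.3 & 0 \\
0.2 & 0.2 & 0.1 & 0.2 & 0 & 0.1 & 0.2 \\
0 & 0 & 0.2 & 0.7 & 0 & 0.1 & 0 \\
0 & 0.5 & 0 & 0 & 0 & 0 & 0.5
\end{bmatrix} \qquad (16.86)$$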



a. Check the transition matrix P for convergence, as in part (a) of Exercise 16.7. How many 
steps does it take to reach convergence to four or more decimal places? Does this agree with 
the theoretical result? 

b. Examine the long run transition matrix. Identify transient states. 

c. The convergence does not make all rows the same. Note, however, that there are two subgroups
of similar rows. Rearrange rows and columns in the long run matrix so that identical
rows are grouped. This suggests subchains. Rearrange the rows and columns in the transition 
matrix P and see that this gives a pattern similar to that for the matrix in Exercise 16.8. 
Raise the rearranged transition matrix to the power for convergence. 

Exercise 16.10 (Solution on p. 529.) 

Use the m-procedure inventory1 (in m-file inventory1.m) to obtain the transition matrix for
maximum stock M = 8, reorder point m = 3, and demand D ~ Poisson(4).

a. Suppose initial stock is six. What will the distribution be for $X_n$, $n = 1, 3, 5$ (i.e., the stock at
the end of periods 1, 3, 5, before restocking)?

b. What will the long run distribution be?
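Solutions to Exercises in Chapter 16

Solution to Exercise 16.1 (p. 523)

$$P(X > r, Y > s|H = u) = e^{-ur/3} e^{-us/5} = e^{-au}, \quad a = \frac{r}{3} + \frac{s}{5} \qquad (16.87)$$

$$P(X > r, Y > s) = \int e^{-au} f_H(u)\, du = \int_1^2 e^{-au}\, du = \frac{1}{a}\left[e^{-a} - e^{-2a}\right] \qquad (16.88)$$

$$\text{For } r = 3,\ s = 10,\ a = 3, \quad P(X > 3, Y > 10) = \frac{1}{3}\left(e^{-3} - e^{-6}\right) = 0.0158 \qquad (16.89)$$

Solution to Exercise 16.2 (p. 523)

$H \sim$ Beta $(r, s)$, $r = 4$, $s = 3$, $n = 12$, $k = 7$

$$E[H|S = k] = \frac{k + r}{n + r + s} = \frac{7 + 4}{12 + 4 + 3} = \frac{11}{19} \qquad (16.90)$$

Solution to Exercise 16.3 (p. 524)

$$E[H|W = k] = \frac{E[H I_{\{k\}}(W)]}{E[I_{\{k\}}(W)]} = \frac{E\{H E[I_{\{k\}}(W)|H]\}}{E\{E[I_{\{k\}}(W)|H]\}} \qquad (16.91)$$

$$= \frac{\int u P(W = k|H = u) f_H(u)\, du}{\int P(W = k|H = u) f_H(u)\, du}, \quad k = (k_1, k_2, \cdots, k_n) \qquad (16.92)$$

$$P(W = k|H = u) = \prod_{i=1}^{n} u(1 - u)^{k_i} = u^n (1 - u)^{k^*}, \quad k^* = \sum_{i=1}^{n} k_i \qquad (16.93)$$

$$E[H|W = k] = \frac{\int_0^1 u^{n+1}(1 - u)^{k^*}\, du}{\int_0^1 u^{n}(1 - u)^{k^*}\, du} = \frac{\Gamma(n + 2)\Gamma(k^* + 1)}{\Gamma(n + 1 + k^* + 2)} \cdot \frac{\Gamma(n + k^* + 2)}{\Gamma(n + 1)\Gamma(k^* + 1)} \qquad (16.94)$$

$$= \frac{n + 1}{n + k^* + 2} \qquad (16.95)$$

Solution to Exercise 16.4 (p. 524)

$$E[H|W = k] = \frac{\int u P(W = k|H = u) f_H(u)\, du}{\int P(W = k|H = u) f_H(u)\, du} \qquad (16.96)$$

$$P(W = k|H = u) = \prod_{i=1}^{n} e^{-u} \frac{u^{k_i}}{k_i!} = e^{-nu} \frac{u^{k^*}}{\prod_{i=1}^{n} k_i!}, \quad k^* = \sum_{i=1}^{n} k_i \qquad (16.97)$$

$$f_H(u) = \frac{\lambda^m u^{m-1} e^{-\lambda u}}{\Gamma(m)} \qquad (16.98)$$

$$E[H|W = k] = \frac{\int_0^{\infty} u^{k^* + m} e^{-(\lambda + n)u}\, du}{\int_0^{\infty} u^{k^* + m - 1} e^{-(\lambda + n)u}\, du} = \frac{\Gamma(m + k^* + 1)}{(\lambda + n)^{k^* + m + 1}} \cdot \frac{(\lambda + n)^{k^* + m}}{\Gamma(m + k^*)} = \frac{m + k^*}{\lambda + n} \qquad (16.99)$$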
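
The numerical value in the solution to Exercise 16.1 can be checked with a few lines of MATLAB. This is a minimal sketch; the variable names and the grid size for the trapezoidal rule are arbitrary choices, not part of the text.

```matlab
% Check of P(X > 3, Y > 10) from Solution 16.1: closed form against numerical integration.
r = 3;
s = 10;
a = r/3 + s/5;                               % a = 3 for these values
p_closed = (exp(-a) - exp(-2*a)) / a;        % (1/a)[e^(-a) - e^(-2a)]
u = linspace(1, 2, 10001);                   % H is uniform on [1, 2], density 1
p_num = trapz(u, exp(-a*u));                 % numerical integral of e^(-au) over [1, 2]
fprintf('closed form %.4f, numerical %.4f\n', p_closed, p_num)
```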

Solution to Exercise 16.5 (p. 524)

$E[g(N) h(Y)|H] = E[g(N)|H]\, E[h(Y)|H]$ a.s. by (CI6) (p. 602) and
$E[g(N)|H] = E[g(N)]$ a.s. by (CE5) (p. 601).

Solution to Exercise 16.6 (p. 524)

$$E[D|H] = \sum_{n=1}^{\infty} E[I_{\{n\}}(N) X_n|H] \quad \text{a.s.} \qquad (16.100)$$

$$E[I_{\{n\}}(N) X_n|H] = \sum_{k=1}^{n} E[I_{\{n\}}(N) Y_k|H] = \sum_{k=1}^{n} P(N = n) E[Y|H] = P(N = n)\, n\, E[Y|H] \quad \text{a.s.} \qquad (16.101)$$

$$E[D|H] = \sum_{n=1}^{\infty} n P(N = n) E[Y|H] = E[N]\, E[Y|H] \quad \text{a.s.} \qquad (16.102)$$


Solution to Exercise 16.7 (p. 524) 



ev = abs(eig(P))'
ev =  1.0000    0.0814    0.0814    0.3572    0.2429
a = ev(4).^[2 4 8 16 24]
a =   0.1276    0.0163    0.0003    0.0000    0.0000
% By P^16 the rows agree to four places
p0 = [0.5 0 0 0.3 0.2];            % An arbitrarily chosen p0
p4 = p0*P^4
p4 =    0.2297    0.2622    0.1444    0.1644    0.1992
p8 = p0*P^8
p8 =    0.2290    0.2611    0.1462    0.1638    0.2000
p16 = p0*P^16
p16 =   0.2289    0.2611    0.1462    0.1638    0.2000
p0a = [0 0 0 0 1];                 % A second choice of p0
p16a = p0a*P^16
p16a =  0.2289    0.2611    0.1462    0.1638    0.2000



Solution to Exercise 16.8 (p. 524) 

Increasing powers $P^n$ show that the probabilities of being in states 6, 7 go to zero. These states cannot be reached
from any of the other states.
Solution to Exercise 16.9 (p. 525) 

Examination of $P^{16}$ suggests that the sets {2, 7} and {3, 4, 6} of states form subchains. Rearrangement of P may be
done as follows:

PA = P([2 7 3 4 6 1 5], [2 7 3 4 6 1 5])
PA =
    0.6000    0.4000         0         0         0         0         0
    0.5000    0.5000         0         0         0         0         0
         0         0    0.2000    0.5000    0.3000         0         0
         0         0    0.6000    0.1000    0.3000         0         0
         0         0    0.2000    0.7000    0.1000         0         0
    0.2000    0.1000    0.1000    0.3000         0    0.1000    0.2000
    0.2000    0.2000    0.1000    0.2000    0.1000    0.2000         0
PA16 = PA^16
PA16 =
    0.5556    0.4444         0         0         0         0         0
    0.5556    0.4444         0         0         0         0         0
         0         0    0.3571    0.3929    0.2500         0         0
         0         0    0.3571    0.3929    0.2500         0         0
         0         0    0.3571    0.3929    0.2500         0         0
    0.2455    0.1964    0.1993    0.2193    0.1395    0.0000    0.0000
    0.2713    0.2171    0.1827    0.2010    0.1279    0.0000    0.0000

It is clear that original states 1 and 5 are transient.
Solution to Exercise 16.10 (p. 525) 



inventory1
Enter value M of maximum stock  8
Enter value m of reorder point  3
Enter row vector of demand values  0:20
Enter demand probabilities  ipoisson(4,0:20)
Result is in matrix P
p0 = [0 0 0 0 0 0 1 0 0];
p1 = p0*P
p1 =
  Columns 1 through 7
    0.2149    0.1563    0.1954    0.1954    0.1465    0.0733    0.0183
  Columns 8 through 9
         0         0
p3 = p0*P^3
p3 =
  Columns 1 through 7
    0.2494    0.1115    0.1258    0.1338    0.1331    0.1165    0.0812
  Columns 8 through 9
    0.0391    0.0096
p5 = p0*P^5
p5 =
  Columns 1 through 7
    0.2598    0.1124    0.1246    0.1311    0.1300    0.1142    0.0799
  Columns 8 through 9
    0.0386    0.0095
a = abs(eig(P))'
a =
  Columns 1 through 7
    1.0000    0.4427    0.1979    0.0284    0.005...   0.0005    0.0000
  Columns 8 through 9
    0.0000    0.0000
a(2)^16
ans =
   2.1759e-06                      % Convergence to at least five decimals for P^16
pinf = p0*P^16                     % Use arbitrary p0, pinf approx p0*P^16
pinf =
  Columns 1 through 7
    0.2622    0.1132    0.1251    0.1310    0.1292    0.1130    0.07...
  Columns 8 through 9
    0.0380    0.0093






Chapter 17 

Appendices 



17.1 Appendix A to Applied Probability: Directory of m-functions
and m-procedures 1

We use the term m-function to designate a user-defined function as distinct from the basic MATLAB functions 
which are part of the MATLAB package. For example, the m-function minterm produces the specified 
minterm vector. An m-procedure (or sometimes a procedure) is an m-file containing a set of MATLAB 
commands which carry out a prescribed set of operations. Generally, these will prompt for (or assume) 
certain data upon which the procedure is carried out. We use the term m-program to refer to either an 
m-function or an m-procedure. 

In addition to the m-programs there is a collection of m-files with properly formatted data which can be 
entered into the workspace by calling the file. 

Although the m-programs were written for MATLAB version 4.2, they work for versions 5.1, 5.2, and
7.04. The later versions offer some new features which may make possible more efficient implementation of some
of the m-programs, and which make possible some new ones. With one exception (so noted), these are not
explored in this collection. 

17.1.1 MATLAB features 

Utilization of MATLAB resources is made possible by a systematic analysis of some features of the basic 
probability model. In particular, the minterm analysis of logical (or Boolean) combinations of events and 
the analysis of the structure of simple random variables with the aid of indicator functions and minterm 
analysis are exploited. 

A number of standard features of MATLAB are utilized extensively. In addition to standard matrix 
algebra, we use: 

1. Array arithmetic. This involves element by element calculations. For example, if a, b are matrices 
of the same size, then a.*b is the matrix obtained by multiplying corresponding elements in the two 
matrices to obtain a new matrix of the same size. 

2. Relational operations, such as less than, equal, etc. to obtain zero-one matrices with ones at element 
positions where the conditions are met. 

3. Logical operations on zero-one matrices utilizing logical operators and, or, and not, as well as certain
related functions such as any, all, not, find, etc. Note. Relational operations and logical operations
produce zero-one arrays, called logical arrays, which MATLAB treats differently from zero-one numeric
arrays. A rectangular array in which some rows are logical arrays but others are not is treated as a
numeric array. Any zero-one rectangular array can be converted to a numeric array (matrix) by the
command A = ones(size(A)).*A.

4. Certain MATLAB functions, such as meshgrid, sum, cumsum, prod, cumprod, are used repeatedly.
The function dot for dot product does not work if either array is a logical array. If one of the pair is
numeric, the command C = A*B' will work.

1 This content is available online at <http://cnx.org/content/m23942/1.7/>.
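
The features listed above can be seen on a small example. The following sketch uses two arbitrary 2 x 3 matrices chosen only for illustration; the variable names are not from the text.

```matlab
% Array arithmetic, relational and logical operations on small matrices.
a = [1 2 3; 4 5 6];
b = [2 0 1; 1 3 0];
prod_ab = a .* b;                 % element-by-element product
mask = a > b;                     % relational operation gives a logical array
both = mask & (b > 0);            % logical "and" of two zero-one arrays
cnt  = sum(both(:));              % number of positions where both conditions hold
A = ones(size(mask)) .* mask;     % conversion of the logical array to a numeric array
disp(prod_ab)
disp(both)
```

Here mask is a logical array, while A holds the same zeros and ones as ordinary numeric values.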



17.1.2 Auxiliary user-defined building blocks 

csort.m 17.1
Description of Code

One of the most useful building blocks is a special sorting and consolidation operation implemented
in the m-function csort. A standard problem arises when each of a nondistinct set of values has an
associated probability. To obtain the distribution, it is necessary to sort the values and add the
probabilities associated with each distinct value. The following m-function achieves these operations:
function [t,p] = csort(T,P). T and P are matrices with the same number of elements. Values of T are
sorted and identical values are consolidated; values of P corresponding to identical values of T are
added. A number of derivative functions and procedures utilize csort. The following two are useful.

Code

function [t,p] = csort(T,P)
% CSORT [t,p] = csort(T,P) Sorts T, consolidates P
% Version of 4/6/97
% Modified to work with Versions 4.2 and 5.1, 5.2
% T and P matrices with the same number of elements
% The vector T(:)' is sorted:
%   * Identical values in T are consolidated;
%   * Corresponding values in P are added.
T = T(:)';
n = length(T);
[TS,I] = sort(T);
d = find([1,TS(2:n) - TS(1:n-1) > 1e-13]);   % Determines distinct values
t = TS(d);                                   % Selects the distinct values
m = length(t) + 1;
P = P(I);                                    % Arranges elements of P
F = [0 cumsum(P(:)')];
Fd = F([d length(F)]);                       % Cumulative sums for distinct values
p = Fd(2:m) - Fd(1:m-1);                     % Separates the sums for these values
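
The effect of the consolidation can be illustrated without calling csort itself. The following sketch uses the built-in functions unique and accumarray on hypothetical data. It is an assumed alternative shown only for comparison: unique tests exact equality, whereas csort treats values differing by no more than 1e-13 as identical.

```matlab
% Sort-and-consolidate with built-in functions (exact equality, unlike csort).
T = [3 1 2 3 1 3];                    % hypothetical values, with repeats
P = [0.1 0.2 0.1 0.3 0.1 0.2];        % probabilities attached to each value
[t, ia, j] = unique(T(:)');           % t: sorted distinct values; j maps T into t
p = accumarray(j(:), P(:))';          % adds the probabilities for each distinct value
disp([t; p]')
```

For these data the distinct values are 1, 2, 3 with probabilities 0.3, 0.1, 0.6.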



distinct.m 17.2
Description of Code
distinct.m function y = distinct(T) determines and sorts the distinct members of matrix T.

Code

function y = distinct(T)
% DISTINCT y = distinct(T) Distinct members of T
% Version of 5/7/96 Rev 4/20/97 for version 4 & 5.1, 5.2
% Determines distinct members of matrix T.
% Members which differ by no more than 10^{-13}
% are considered identical. y is a row
% vector of the distinct members.
TS = sort(T(:)');
n = length(TS);
d = [1 abs(TS(2:n) - TS(1:n-1)) > 1e-13];
y = TS(find(d));



freq.m 17.3

Description of Code

freq.m sorts the distinct members of a matrix, counts the number of occurrences of each value,
and calculates the cumulative relative frequencies.

Code

% FREQ file freq.m Frequencies of members of matrix
% Version of 5/7/96
% Sorts the distinct members of a matrix, counts
% the number of occurrences of each value, and
% calculates the cumulative relative frequencies.
T = input('Enter matrix to be counted ');
[m,n] = size(T);
[t,f] = csort(T,ones(m,n));
p = cumsum(f)/(m*n);
disp(['The number of entries is ',num2str(m*n)])
disp(['The number of distinct entries is ',num2str(length(t))])
disp(' ')
dis = [t;f;p]';
disp('    Values     Count   Cum Frac')
disp(dis)



dsum.m 17.4

Description of Code

dsum.m function y = dsum(v,w) determines and sorts the distinct elements among the sums of
pairs of elements of row vectors v and w.

Code

function y = dsum(v,w)
% DSUM y = dsum(v,w) Distinct pair sums of elements
% Version of 5/15/97
% y is a row vector of distinct
% values among pair sums of elements
% of matrices v, w.
% Uses m-function distinct
[a,b] = meshgrid(v,w);
t = a+b;
y = distinct(t(:)');



rep.m 17.5
Description of Code

rep.m function y = rep(A,m,n) replicates matrix A, m times vertically and n times horizontally.
Essentially the same as the function repmat in MATLAB version 5, released December, 1996.

Code

function y = rep(A,m,n)
% REP y = rep(A,m,n) Replicates matrix A
% Version of 4/21/96
% Replicates A,
% m times vertically,
% n times horizontally
% Essentially the same as repmat in version 5.1, 5.2
[r,c] = size(A);
R = [1:r]';
C = [1:c]';
v = R(:,ones(1,m));
w = C(:,ones(1,n));
y = A(v,w);



elrep.m 17.6
Description of Code

elrep.m function y = elrep(A,m,n) replicates each element of A, m times vertically and n times
horizontally.

Code

function y = elrep(A,m,n)
% ELREP y = elrep(A,m,n) Replicates elements of A
% Version of 4/21/96
% Replicates each element,
% m times vertically,
% n times horizontally
[r,c] = size(A);
R = 1:r;
C = 1:c;
v = R(ones(1,m),:);
w = C(ones(1,n),:);
y = A(v,w);
kronf.m 17.7
Description of Code

kronf.m function y = kronf(A,B) determines the Kronecker product of matrices A, B.
Achieves the same result for full matrices as the MATLAB function kron.

Code

function y = kronf(A,B)
% KRONF y = kronf(A,B) Kronecker product
% Version of 4/21/96
% Calculates Kronecker product of full matrices.
% Uses m-functions elrep and rep
% Same result for full matrices as kron for version 5.1, 5.2
[r,c] = size(B);
[m,n] = size(A);
y = elrep(A,r,c).*rep(B,m,n);



colcopy.m 17.8

Description of Code

colcopy.m function y = colcopy(v,n) treats row or column vector v as a column vector and
makes a matrix with n columns of v.

Code

function y = colcopy(v,n)
% COLCOPY y = colcopy(v,n) n columns of v
% Version of 6/8/95 (Arguments reversed 5/7/96)
% v a row or column vector
% Treats v as column vector
% and makes n copies
% Procedure based on "Tony's trick"
[r,c] = size(v);
if r == 1
  v = v';
end
y = v(:,ones(1,n));



colcopyi.m 17.9

Description of Code

colcopyi.m function y = colcopyi(v,n) treats row or column vector v as a column vector,
reverses the order of the elements, and makes a matrix with n columns of the reversed vector.

Code

function y = colcopyi(v,n)
% COLCOPYI y = colcopyi(v,n) n columns in reverse order
% Version of 8/22/96
% v a row or column vector.
% Treats v as column vector,
% reverses the order of the
% elements, and makes n copies.
% Procedure based on "Tony's trick"
N = ones(1,n);
[r,c] = size(v);
if r == 1
  v = v(c:-1:1)';
else
  v = v(r:-1:1);
end
y = v(:,N);



rowcopy.m 17.10

Description of Code

rowcopy.m function y = rowcopy(v,n) treats row or column vector v as a row vector and makes
a matrix with n rows of v.

Code

function y = rowcopy(v,n)
% ROWCOPY y = rowcopy(v,n) n rows of v
% Version of 5/7/96
% v a row or column vector
% Treats v as row vector
% and makes n copies
% Procedure based on "Tony's trick"
[r,c] = size(v);
if c == 1
  v = v';
end
y = v(ones(1,n),:);



repseq.m 17.11
Description of Code

repseq.m function y = repseq(V,n) replicates vector V n times: horizontally if V is a row
vector and vertically if V is a column vector.

Code

function y = repseq(V,n);
% REPSEQ y = repseq(V,n) Replicates vector V n times
% Version of 3/27/97
% n replications of vector V
% Horizontally if V a row vector
% Vertically if V a column vector
m = length(V);
s = rem(0:n*m-1,m)+1;
y = V(s);



total.m 17.12
Description of Code
total.m Total of all elements in a matrix, calculated by: total(x) = sum(sum(x)).

Code

function y = total(x)
% TOTAL y = total(x)
% Version of 8/1/93
% Total of all elements in matrix x.
y = sum(sum(x));



dispv.m 17.13
Description of Code
dispv.m Matrices A, B are transposed and displayed side by side.

Code

function y = dispv(A,B)
% DISPV y = dispv(A,B) Transpose of A, B side by side
% Version of 5/3/96
% A, B are matrices of the same size
% They are transposed and displayed
% side by side.
y = [A;B]';



roundn.m 17.14
Description of Code
roundn.m function y = roundn(A,n) rounds matrix A to n decimal places.

Code

function y = roundn(A,n);
% ROUNDN y = roundn(A,n)
% Version of 7/28/97
% Rounds matrix A to n decimals
y = round(A*10^n)/10^n;
arrep.m 17.15

Description of Code

arrep.m function y = arrep(n,k) forms all arrangements, with repetition, of k elements from
the sequence 1:n.

Code

function y = arrep(n,k);
% ARREP y = arrep(n,k);
% Version of 7/28/97
% Computes all arrangements of k elements of 1:n,
% with repetition allowed. k may be greater than n.
% If only one input argument n, then k = n.
% To get arrangements of column vector V, use
% V(arrep(length(V),k)).
N = 1:n;
if nargin == 1
  k = n;
end
y = zeros(k,n^k);
for i = 1:k
  y(i,:) = rep(elrep(N,1,n^(k-i)),1,n^(i-1));
end



17.1.3 Minterm vectors and probabilities 

The analysis of logical combinations of events (as sets) is systematized by the use of the minterm expansion. 
This leads naturally to the notion of minterm vectors. These are zero-one vectors which can be combined 
by logical operations. Production of the basic minterm patterns is essential to a number of operations. The 
following m-programs are key elements of various other programs. 

minterm.m 17.16
Description of Code
minterm.m function y = minterm(n,k) generates the kth minterm vector in a class of n.

Code

function y = minterm(n,k)
% MINTERM y = minterm(n,k) kth minterm of class of n
% Version of 5/5/96
% Generates the kth minterm vector in a class of n
% Uses m-function rep
y = rep([zeros(1,2^(n-k)) ones(1,2^(n-k))],1,2^(k-1));
mintable.m 17.17

Description of Code

mintable.m function y = mintable(n) generates a table of minterm vectors by repeated use of
the m-function minterm.

Code

function y = mintable(n)
% MINTABLE y = mintable(n) Table of minterms vectors
% Version of 3/2/93
% Generates a table of minterm vectors
% Uses the m-function minterm
y = zeros(n,2^n);
for i = 1:n
  y(i,:) = minterm(n,i);
end
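
To see what the minterm vectors look like and how they combine, the following self-contained sketch builds the table for n = 3 with the built-in repmat in place of the m-function rep, then forms the minterm vector for one Boolean combination. The particular combination and the variable names are chosen only for illustration.

```matlab
% Minterm vectors for a class of three events, built with repmat instead of rep.
n = 3;
M = zeros(n, 2^n);
for k = 1:n
    block = [zeros(1, 2^(n-k)) ones(1, 2^(n-k))];
    M(k, :) = repmat(block, 1, 2^(k-1));    % kth minterm vector
end
A = M(1, :);
B = M(2, :);
C = M(3, :);
F = A | (B & ~C);          % minterm vector for the union of A with B C^c
idx = find(F) - 1;         % minterm numbers (0 through 7) included in F
disp(M)
disp(idx)
```

The combination consists of minterms 2, 4, 5, 6, and 7.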



minvec3.m 17.18

Description of Code

minvec3.m sets basic minterm vectors $A, B, C, A^c, B^c, C^c$ for the class $\{A, B, C\}$. (Similarly for
minvec4.m, minvec5.m, etc.)

Code

% MINVEC3 file minvec3.m Basic minterm vectors
% Version of 1/31/95
A = minterm(3,1);
B = minterm(3,2);
C = minterm(3,3);
Ac = ~A;
Bc = ~B;
Cc = ~C;
disp('Variables are A, B, C, Ac, Bc, Cc')
disp('They may be renamed, if desired.')



minmap.m 17.19

Description of Code

minmap.m function y = minmap(pm) reshapes a row or column vector pm of minterm probabilities
into minterm map format.

Code

function y = minmap(pm)
% MINMAP y = minmap(pm) Reshapes vector of minterm probabilities
% Version of 12/9/93
% Reshapes a row or column vector pm of minterm
% probabilities into minterm map format
m = length(pm);
n = round(log(m)/log(2));
a = fix(n/2);
if m ~= 2^n
  disp('The number of minterms is incorrect')
else
  y = reshape(pm,2^a,2^(n-a));
end



binary.m 17.20 

Description of Code 

binary.m function y = binary(d,n) converts a matrix d of floating point nonnegative integers
to a matrix of binary equivalents, one on each row. Adapted from m-functions written by Hans
Olsson and by Simon Cooke. Each matrix row may be converted to an unspaced string of zeros
and ones by the device ys = setstr(y + '0').

Code 



function y = binary(d,n)
% BINARY y = binary(d,n) Integers to binary equivalents
% Version of 7/14/95
% Converts a matrix d of floating point, nonnegative
% integers to a matrix of binary equivalents. Each row
% is the binary equivalent (n places) of one number.
% Adapted from the programs dec2bin.m, which shared
% first prize in an April 95 Mathworks contest.
% Winning authors: Hans Olsson from Lund, Sweden,
% and Simon Cooke from Glasgow, UK.
% Each matrix row may be converted to an unspaced string
% of zeros and ones by the device: ys = setstr(y + '0').
if nargin < 2, n = 1; end % Allows omission of argument n
[f,e] = log2(d);
n = max(max(max(e)),n);
y = rem(floor(d(:)*pow2(1-n:0)),2);
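
The central line of binary can be tried on its own. The sketch below fixes n = 3 places rather than deriving n from log2, and uses the built-in char for the string conversion (setstr is an older name for the same conversion); the sample values in d are illustrative only.

```matlab
% Sketch: binary equivalents of a few small integers, n places each.
d = [5; 3; 6];                          % illustrative nonnegative integers
n = 3;                                  % number of binary places (fixed here)
y = rem(floor(d(:)*pow2(1-n:0)), 2);    % each row: binary digits of one number
ys = char(y + '0');                     % unspaced strings of zeros and ones
disp(ys)
```

Multiplying by pow2(1-n:0) shifts each number right by n-1, ..., 1, 0 places; floor discards the fractional bits and rem(.,2) keeps the lowest remaining bit.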



mincalc.m 17.21 

Description of Code 

mincalc.m The m-procedure mincalc determines minterm probabilities from suitable data. For a 
discussion of the data formatting and certain problems, see 2.6. 

Code 



% MINCALC file mincalc.m Determines minterm probabilities
% Version of 1/22/94 Updated for version 5.1 on 6/6/97
% Assumes a data file which includes
% 1. Call for minvecq to set q basic minterm vectors, each (1 x 2^q)
% 2. Data vectors DV = matrix of md data Boolean combinations of basic sets--
%    Matlab produces md minterm vectors-- one on each row.
%    The first combination is always A|Ac (the whole space)
% 3. DP = row matrix of md data probabilities.
%    The first probability is always 1.
% 4. Target vectors TV = matrix of mt target Boolean combinations.
%    Matlab produces a row minterm vector for each target combination.
%    If there are no target combinations, set TV = [];
[md,nd] = size(DV);
ND = 0:nd-1;
ID = eye(nd); % Row i is minterm vector i-1
[mt,nt] = size(TV);
MT = 1:mt;
rd = rank(DV);
if rd < md
    disp('Data vectors are NOT linearly independent')
else
    disp('Data vectors are linearly independent')
end
% Identification of which minterm probabilities can be determined from the data
% (i.e., which minterm vectors are not linearly independent of data vectors)
AM = zeros(1,nd);
for i = 1:nd
    AM(i) = rd == rank([DV;ID(i,:)]); % Checks for linear dependence of each
end
am = find(AM); % minterm vector
CAM = ID(am,:)/DV; % Determination of coefficients for the available minterms
pma = DP*CAM'; % Calculation of probabilities of available minterms
PMA = [ND(am);pma]';
if sum(pma < -0.001) > 0 % Check for data consistency
    disp('Data probabilities are INCONSISTENT')
else
    % Identification of which target probabilities are computable from the data
    CT = zeros(1,mt);
    for j = 1:mt
        CT(j) = rd == rank([DV;TV(j,:)]);
    end
    ct = find(CT);
    CCT = TV(ct,:)/DV; % Determination of coefficients for computable targets
    ctp = DP*CCT'; % Determination of probabilities
    disp(' Computable target probabilities')
    disp([MT(ct); ctp]')
end % end for "if sum(pma < -0.001) > 0"
disp(['The number of minterms is ',num2str(nd),])
disp(['The number of available minterms is ',num2str(length(pma)),])
disp('Available minterm probabilities are in vector pma')
disp('To view available minterm probabilities, call for PMA')
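
The rank test and the division by DV used in mincalc can be followed on a two-event example. The sketch below is self-contained; the class {A, B} and the data probabilities 0.6 and 0.2 are illustrative assumptions, not data from the text.

```matlab
% Sketch: one target probability computed the way mincalc does it.
A = [0 0 1 1];                          % minterm vectors for the class {A, B}
B = [0 1 0 1];
DV = double([A | ~A; A; A & B]);        % data combinations, whole space first
DP = [1 0.6 0.2];                       % assumed data: P(whole space), P(A), P(AB)
TV = double(A & ~B);                    % target combination A B^c
rd = rank(DV);
computable = (rd == rank([DV; TV]));    % true if TV is a combination of rows of DV
CCT = TV/DV;                            % coefficients x with x*DV = TV
ctp = DP*CCT';                          % the same combination of the data probabilities
disp(ctp)
```

Here the target vector is the row for A minus the row for AB, so the coefficients are [0 1 -1] and the result is P(A) - P(AB).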



mincalct.m 17.22 

Description of Code 

mincalct.m Modification of mincalc. Assumes mincalc has been run, calls for new target vectors 
and performs same calculations as mincalc. 

Code 

% MINCALCT file mincalct.m Additional target probabilities
% Version of 9/1/93 Updated for version 5 on 6/6/97
% Assumes a data file which includes
% 1. Call for minvecq to set q basic minterm vectors.
% 2. Data vectors DV. The first combination is always A|Ac.
% 3. Row matrix DP of data probabilities. The first entry is always 1.
TV = input('Enter matrix of target Boolean combinations ');
[md,nd] = size(DV);
[mt,nt] = size(TV);
MT = 1:mt;
rd = rank(DV);
CT = zeros(1,mt); % Identification of computable target probabilities
for j = 1:mt
    CT(j) = rd == rank([DV;TV(j,:)]);
end
ct = find(CT);
CCT = TV(ct,:)/DV; % Determination of coefficients for computable targets
ctp = DP*CCT'; % Determination of probabilities
disp(' Computable target probabilities')
disp([MT(ct); ctp]')



17.1.4 Independent events 

minprob.m 17.23 

Description of Code 

minprob.m function y = minprob(p) calculates minterm probabilities for the basic probabilities
in row or column vector p. Uses the m-functions mintable, colcopy.

Code 



function y = minprob(p)
% MINPROB y = minprob(p) Minterm probs for independent events
% Version of 4/7/96
% p is a vector [P(A1) P(A2) ... P(An)], with
% {A1,A2, ... An} independent.
% y is the row vector of minterm probabilities
% Uses the m-functions mintable, colcopy
n = length(p);
M = mintable(n);
a = colcopy(p,2^n); % 2^n columns, each the vector p
m = a.*M + (1 - a).*(1 - M); % Puts probabilities into the minterm
                             % pattern on its side (n by 2^n)
y = prod(m); % Product of each column of m
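
A small case shows what minprob computes. The sketch below rebuilds the minterm table inline and uses the built-in repmat where the listing uses mintable and colcopy (an assumption about how those m-functions behave); the probabilities 0.5 and 0.2 are illustrative.

```matlab
% Sketch: minterm probabilities for two independent events.
p = [0.5 0.2];                   % assumed values of P(A1), P(A2)
n = length(p);
M = zeros(n, 2^n);               % table of minterm vectors
for k = 1:n
    M(k,:) = repmat([zeros(1,2^(n-k)) ones(1,2^(n-k))], 1, 2^(k-1));
end
a = repmat(p(:), 1, 2^n);        % 2^n columns, each the vector p
m = a.*M + (1 - a).*(1 - M);     % p where the event occurs, 1 - p where it does not
y = prod(m, 1);                  % product down each column
disp(y)
```

Each entry of y is a product of factors P(Ai) or 1 - P(Ai), so the entries sum to one.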

imintest.m 17.24

Description of Code

imintest.m function y = imintest(pm) checks minterm probabilities for independence.

Code 



function y = imintest(pm)
% IMINTEST y = imintest(pm) Checks minterm probs for independence
% Version of 1/25/96
% Checks minterm probabilities for independence
% Uses the m-functions mintable and minprob
m = length(pm);
n = round(log(m)/log(2));
if m ~= 2^n
    y = 'The number of minterm probabilities is incorrect';
else
    P = mintable(n)*pm';
    pt = minprob(P');
    a = fix(n/2);
    s = abs(pm - pt) > 1e-7;
    if sum(s) > 0
        disp('The class is NOT independent')
        disp('Minterms for which the product rule fails')
        y = reshape(s,2^a,2^(n-a));
    else
        y = 'The class is independent';
    end
end



ikn.m 17.25 
Description of Code 

ikn.m function y = ikn(P,k) determines the probability of the occurrence of exactly k of the n
independent events whose probabilities are in row or column vector P
(k may be a row or column vector of nonnegative integers less than or equal to n).

Code 



function y = ikn(P,k)
% IKN y = ikn(P,k) Individual probabilities of k of n successes
% Version of 5/15/95
% Uses the m-functions mintable, minprob, csort
n = length(P);
T = sum(mintable(n)); % The number of successes in each minterm
pm = minprob(P); % The probability of each minterm
[t,p] = csort(T,pm); % Sorts and consolidates success numbers
                     % and adds corresponding probabilities
y = p(k+1);

ckn.m 17.26

Description of Code

ckn.m function y = ckn(P,k) determines the probability of the occurrence of k or more of the n
independent events whose probabilities are in row or column vector P (k may be a row or column
vector).

Code 



function y = ckn(P,k)
% CKN y = ckn(P,k) Probability of k or more successes
% Version of 5/15/95
% Probabilities of k or more of n independent events
% Uses the m-functions mintable, minprob, csort
n = length(P);
m = length(k);
T = sum(mintable(n)); % The number of successes in each minterm
pm = minprob(P); % The probability of each minterm
[t,p] = csort(T,pm); % Sorts and consolidates success numbers
                     % and adds corresponding probabilities
for i = 1:m % Sums probabilities for each k value
    y(i) = sum(p(k(i)+1:n+1));
end
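
The quantities returned by ikn and ckn can be cross-checked without the minterm machinery. The sketch below is a different technique from the listings: it builds the distribution of the number of successes by repeated convolution with the built-in conv. The probabilities in P are illustrative.

```matlab
% Sketch (alternative method): number of successes among independent events.
P = [0.1 0.2 0.3];                      % assumed event probabilities
dist = 1;                               % distribution for zero events
for i = 1:length(P)
    dist = conv(dist, [1 - P(i), P(i)]);   % add one event at a time
end
k = 2;
exactly_k = dist(k+1);                  % corresponds to ikn(P,k)
k_or_more = sum(dist(k+1:end));         % corresponds to ckn(P,k)
disp([exactly_k k_or_more])
```

The vector dist plays the role of the vector p produced by csort in the listings: entry j+1 is the probability of exactly j successes.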



parallel.m 17.27 

Description of Code 

parallel.m function y = parallel(p) determines the probability of a parallel combination of
the independent events whose probabilities are in row or column vector p.

Code 



function y = parallel(p)
% PARALLEL y = parallel(p) Probability of parallel combination
% Version of 3/3/93
% Probability of parallel combination.
% Individual probabilities in row matrix p.
y = 1 - prod(1 - p);

17.1.5 Conditional probability and conditional independence

bayes.m 17.28 

Description of Code 

bayes.m produces a Bayesian reversal of conditional probabilities. The input consists of $P(E|A_i)$
and $P(A_i)$ for a disjoint class $\{A_i : 1 \le i \le n\}$ whose union contains $E$. The procedure calculates
$P(A_i|E)$ and $P(A_i|E^c)$ for $1 \le i \le n$.

Code 



% BAYES file bayes.m Bayesian reversal of conditional probabilities
% Version of 7/6/93
% Input P(E|Ai) and P(Ai)
% Calculates P(Ai|E) and P(Ai|Ec)
disp('Requires input PEA = [P(E|A1) P(E|A2) ... P(E|An)]')
disp(' and PA = [P(A1) P(A2) ... P(An)]')
disp('Determines PAE = [P(A1|E) P(A2|E) ... P(An|E)]')
disp(' and PAEc = [P(A1|Ec) P(A2|Ec) ... P(An|Ec)]')
PEA = input('Enter matrix PEA of conditional probabilities ');
PA = input('Enter matrix PA of probabilities ');
PE = PEA*PA';
PAE = (PEA.*PA)/PE;
PAEc = ((1 - PEA).*PA)/(1 - PE);
disp(' ')
disp(['P(E) = ',num2str(PE),])
disp(' ')
disp(' P(E|Ai) P(Ai) P(Ai|E) P(Ai|Ec)')
disp([PEA; PA; PAE; PAEc]')
disp('Various quantities are in the matrices PEA, PA, PAE, PAEc, named above')
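
The three assignment lines of bayes can be run without the prompts by setting the inputs directly. The values of PEA and PA below are illustrative assumptions.

```matlab
% Sketch: Bayesian reversal with fixed inputs instead of input() prompts.
PEA = [0.2 0.5];                    % assumed P(E|A1), P(E|A2)
PA  = [0.6 0.4];                    % assumed P(A1), P(A2); must sum to one
PE   = PEA*PA';                     % total probability of E
PAE  = (PEA.*PA)/PE;                % P(Ai|E)
PAEc = ((1 - PEA).*PA)/(1 - PE);    % P(Ai|Ec)
disp([PEA; PA; PAE; PAEc]')
```

Since the Ai are disjoint and their union contains E, each of PAE and PAEc sums to one, which gives a quick check on the input data.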



odds.m 17.29 

Description of Code 

odds.m The procedure calculates posterior odds for a specified profile E. Assumes data have
been entered by the procedure oddsdf or oddsdp.

Code 



% ODDS file odds.m Posterior odds for profile
% Version of 12/4/93
% Calculates posterior odds for profile E
% Assumes data has been entered by oddsdf or oddsdp
E = input('Enter profile matrix E ');
C = diag(a(:,E))'; % aa = a(:,E) is an n by n matrix whose ith column
D = diag(b(:,E))'; % is the E(i)th column of a. The elements on the
                   % diagonal are a(i, E(i)), 1 <= i <= n
                   % Similarly for b(:,E)
R = prod(C./D)*(p1/p2); % Calculates posterior odds for profile
disp(' ')
disp(['Odds favoring Group 1: ',num2str(R),])
if R > 1
    disp('Classify in Group 1')
else
    disp('Classify in Group 2')
end



oddsdf.m 17.30 
Description of Code 
oddsdf.m Sets up calibrating frequencies for calculating posterior odds. 

Code 



% ODDSDF file oddsdf.m Frequencies for calculating odds
% Version of 12/4/93
% Sets up calibrating frequencies
% for calculating posterior odds
A = input('Enter matrix A of frequencies for calibration group 1 ');
B = input('Enter matrix B of frequencies for calibration group 2 ');
n = length(A(:,1)); % Number of questions (rows of A)
m = length(A(1,:)); % Number of answers to each question
p1 = sum(A(1,:)); % Number in calibration group 1
p2 = sum(B(1,:)); % Number in calibration group 2
a = A/p1;
b = B/p2;
disp(' ') % Blank line in presentation
disp(['Number of questions = ',num2str(n),]) % Size of profile
disp(['Answers per question = ',num2str(m),]) % Usually 3: yes, no, uncertain
disp(' Enter code for answers and call for procedure "odds" ')
disp(' ')



oddsdp.m 17.31 
Description of Code 
oddsdp.m Sets up conditional probabilities for odds calculations. 

Code 



% ODDSDP file oddsdp.m Conditional probs for calculating posterior odds
% Version of 12/4/93
% Sets up conditional probabilities
% for odds calculations
a = input('Enter matrix A of conditional probabilities for Group 1 ');
b = input('Enter matrix B of conditional probabilities for Group 2 ');
p1 = input('Probability p1 an individual is from Group 1 ');
n = length(a(:,1));
m = length(a(1,:));
p2 = 1 - p1;
disp(' ') % Blank line in presentation
disp(['Number of questions = ',num2str(n),]) % Size of profile
disp(['Answers per question = ',num2str(m),]) % Usually 3: yes, no, uncertain
disp(' Enter code for answers and call for procedure "odds" ')
disp(' ')



17.1.6 Bernoulli and multinomial trials 

btdata.m 17.32 

Description of Code 

btdata.m Sets parameter p and number n of trials for generating Bernoulli sequences. Prompts 
for bt to generate the trials. 

Code 



% BTDATA file btdata.m Parameters for Bernoulli trials
% Version of 11/28/92
% Sets parameters for generating Bernoulli trials
% Prompts for bt to generate the trials
n = input('Enter n, the number of trials ');
p = input('Enter p, the probability of success on each trial ');
disp(' ')
disp(' Call for bt')
disp(' ')



bt.m 17.33 



Description of Code 

bt.m Generates Bernoulli sequence for parameters set by btdata. Calculates relative frequency of
successes.

Code 



% BT file bt.m Generates Bernoulli sequence
% version of 8/11/95 Revised 7/31/97 for version 4.2 and 5.1, 5.2
% Generates Bernoulli sequence for parameters set by btdata
% Calculates relative frequency of 'successes'
clear SEQ;
B = rand(n,1) <= p; % ones for random numbers <= p
F = sum(B)/n; % relative frequency of ones
N = [1:n]'; % display details
disp(['n = ',num2str(n),' p = ',num2str(p),])
disp(['Relative frequency = ',num2str(F),])
SEQ = [N B];
clear N;
clear B;
disp('To view the sequence, call for SEQ')
disp(' ')



binomial.m 17.34

Description of Code

binomial.m Uses ibinom and cbinom to generate tables of the individual and cumulative binomial
probabilities for specified parameters. Note that for calculation in MATLAB it is usually much
more convenient and efficient to use ibinom and/or cbinom.

Code 



% BINOMIAL file binomial.m Generates binomial tables
% Version of 12/10/92 (Display modified 4/28/96)
% Calculates a TABLE of binomial probabilities
% for specified n, p, and row vector k,
% Uses the m-functions ibinom and cbinom.
n = input('Enter n, the number of trials ');
p = input('Enter p, the probability of success ');
k = input('Enter k, a row vector of success numbers ');
y = ibinom(n,p,k);
z = cbinom(n,p,k);
disp([' n = ',int2str(n),' p = ' num2str(p)])
H = [' k P(X = k) P(X >= k)'];
disp(H)
disp([k;y;z]')



multinom.m 17.35 
Description of Code 
multinom.m Multinomial distribution (small N,m). 

Code 



% MULTINOM file multinom.m Multinomial distribution
% Version of 8/24/96
% Multinomial distribution (small N, m)
N = input('Enter the number of trials ');
m = input('Enter the number of types ');
p = input('Enter the type probabilities ');
M = 1:m;
T = zeros(m^N,N);
for i = 1:N
    a = rowcopy(M,m^(i-1));
    a = a(:);
    a = colcopy(a,m^(N-i));
    T(:,N-i+1) = a(:); % All possible strings of the types
end
MT = zeros(m^N,m);
for i = 1:m
    MT(:,i) = sum(T'==i)';
end
clear T % To conserve memory
disp('String frequencies for type k are in column matrix MT(:,k)')
P = zeros(m^N,N);
for i = 1:N
    a = rowcopy(p,m^(i-1));
    a = a(:);
    a = colcopy(a,m^(N-i));
    P(:,N-i+1) = a(:); % Strings of type probabilities
end
PS = prod(P'); % Probability of each string
clear P % To conserve memory
disp('String probabilities are in row matrix PS')



17.1.7 Some matching problems 

Cardmatch.m 17.36 

Description of Code 

Cardmatch.m Sampling to estimate the probability of one or more matches when one card is 
drawn from each of nd identical decks of c cards. The number ns of samples is specified. 

Code 



% CARDMATCH file cardmatch.m Prob of matches in cards from identical decks
% Version of 6/27/97
% Estimates the probability of one or more matches
% in drawing cards from nd decks of c cards each
% Produces a supersample of size n = nd*ns, where
% ns is the number of samples
% Each sample is sorted, and then tested for differences
% between adjacent elements. Matches are indicated by
% zero differences between adjacent elements in sorted sample
c = input('Enter the number c of cards in a deck ');
nd = input('Enter the number nd of decks ');
ns = input('Enter the number ns of sample runs ');
X = 1:c; % Population values
PX = (1/c)*ones(1,c); % Population probabilities
N = nd*ns; % Length of supersample
U = rand(1,N); % Matrix of N random numbers
T = dquant(X,PX,U); % Supersample obtained with quantile function;
                    % the function dquant determines quantile
                    % function values of random number sequence U
ex = sum(T)/N; % Sample average
EX = dot(X,PX); % Population mean
vx = sum(T.^2)/N - ex^2; % Sample variance
VX = dot(X.^2,PX) - EX^2; % Population variance
A = reshape(T,nd,ns); % Chops supersample into ns samples of size nd
DS = diff(sort(A)); % Sorts each sample
m = sum(DS==0)>0; % Differences between elements in each sample
                  % Zero difference iff there is a match
pm = sum(m)/ns; % Fraction of samples with one or more matches
Pm = 1 - comb(c,nd)*gamma(nd + 1)/c^(nd); % Theoretical probability of match
disp('The sample is in column vector T') % Displays of results
disp(['Sample average ex = ',num2str(ex),])
disp(['Population mean E(X) = ',num2str(EX),])
disp(['Sample variance vx = ',num2str(vx),])
disp(['Population variance V(X) = ',num2str(VX),])
disp(['Fraction of samples with one or more matches pm = ',num2str(pm),])
disp(['Probability of one or more matches in a sample Pm = ',num2str(Pm),])
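
The theoretical value Pm in cardmatch is one minus the probability that all nd cards differ. As a sketch, the same number can be written as a product of nd ratios, which avoids forming large factorials. The built-ins nchoosek and factorial stand in below for comb and gamma, and c = 52, nd = 5 are illustrative values.

```matlab
% Sketch: theoretical probability of one or more matches, two ways.
c = 52; nd = 5;                                     % assumed deck size, number of decks
Pm_prod = 1 - prod((c:-1:c-nd+1)/c);                % ratio form, no large factorials
Pm_comb = 1 - nchoosek(c,nd)*factorial(nd)/c^nd;    % form used in the listing
disp([Pm_prod Pm_comb])
```

The two forms agree; the product form remains usable for values of c large enough that gamma(c + 1) overflows in double precision.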



trialmatch.m 17.37 

Description of Code 

trialmatch.m Estimates the probability of matches in n independent trials from identical
distributions. The sample size and number of trials must be kept relatively small to avoid exceeding
available memory.

Code 



% TRIALMATCH file trialmatch.m Estimates probability of matches
% in n independent trials from identical distributions
% Version of 8/20/97
% Estimates the probability of one or more matches
% in a random selection from n identical distributions
% with a small number of possible values
% Produces a supersample of size N = n*ns, where
% ns is the number of samples. Samples are separated.
% Each sample is sorted, and then tested for differences
% between adjacent elements. Matches are indicated by
% zero differences between adjacent elements in sorted sample.
X = input('Enter the VALUES in the distribution ');
PX = input('Enter the PROBABILITIES ');
c = length(X);
n = input('Enter the SAMPLE SIZE n ');
ns = input('Enter the number ns of sample runs ');
N = n*ns; % Length of supersample
U = rand(1,N); % Vector of N random numbers
T = dquant(X,PX,U); % Supersample obtained with quantile function;
                    % the function dquant determines quantile
                    % function values for random number sequence U
ex = sum(T)/N; % Sample average
EX = dot(X,PX); % Population mean
vx = sum(T.^2)/N - ex^2; % Sample variance
VX = dot(X.^2,PX) - EX^2; % Population variance
A = reshape(T,n,ns); % Chops supersample into ns samples of size n
DS = diff(sort(A)); % Sorts each sample
m = sum(DS==0)>0; % Differences between elements in each sample
                  % -- Zero difference iff there is a match
pm = sum(m)/ns; % Fraction of samples with one or more matches
d = arrep(c,n);
p = PX(d);
p = reshape(p,size(d)); % This step not needed in version 5.1
ds = diff(sort(d))==0;
mm = sum(ds)>0;
m0 = find(1-mm);
pm0 = p(:,m0); % Probabilities for arrangements with no matches
P0 = sum(prod(pm0));
disp('The sample is in column vector T') % Displays of results
disp(['Sample average ex = ',num2str(ex),])
disp(['Population mean E(X) = ',num2str(EX),])
disp(['Sample variance vx = ',num2str(vx),])
disp(['Population variance V(X) = ',num2str(VX),])
disp(['Fraction of samples with one or more matches pm = ',num2str(pm),])
disp(['Probability of one or more matches in a sample Pm = ',num2str(1-P0),])



17.1.8 Distributions 

comb.m 17.38 

Description of Code 

comb.m function y = comb(n,k) calculates binomial coefficients. k may be a matrix of integers
between 0 and n. The result y is a matrix of the same dimensions.

Code 



function y = comb(n,k)
% COMB y = comb(n,k) Binomial coefficients
% Version of 12/10/92
% Computes binomial coefficients C(n,k)
% k may be a matrix of integers between 0 and n
% result y is a matrix of the same dimensions
y = round(gamma(n+1)./(gamma(k + 1).*gamma(n + 1 - k)));



ibinom.m 17.39 

Description of Code 

ibinom.m Binomial distribution: individual terms. We have two m-functions ibinom and cbinom
for calculating individual and cumulative terms, $P(S_n = k)$ and $P(S_n \ge k)$, respectively.

$$P(S_n = k) = C(n,k)\, p^k (1-p)^{n-k} \quad \text{and} \quad P(S_n \ge k) = \sum_{r=k}^{n} P(S_n = r), \quad 0 \le k \le n \qquad (17.1)$$

For these m-functions, we use a modification of a computation strategy employed by S. Weintraub:
Tables of the Cumulative Binomial Probability Distribution for Small Values of
p, 1963. The book contains a particularly helpful error analysis, written by Leo J. Cohen.
Experimentation with sums and expectations indicates a precision for ibinom and cbinom calculations
that is better than $10^{-10}$ for n = 1000 and p from 0.01 to 0.99. A similar precision holds for values
of n up to 5000, provided np or nq are limited to approximately 500. Above this value for np or
nq, the computations break down. For individual terms, function y = ibinom(n,p,k) calculates
the probabilities for n a positive integer, k a matrix of integers between 0 and n. The output is a
matrix of the corresponding binomial probabilities.

Code 



function y = ibinom(n,p,k)
% IBINOM y = ibinom(n,p,k) Individual binomial probabilities
% Version of 10/5/93
% n is a positive integer; p is a probability
% k a matrix of integers between 0 and n
% y = P(X=k) (a matrix of probabilities)
if p > 0.5
    a = [1 ((1-p)/p)*ones(1,n)];
    b = [1 n:-1:1];
    c = [1 1:n];
    br = (p^n)*cumprod(a.*b./c);
    bi = fliplr(br);
else
    a = [1 (p/(1-p))*ones(1,n)];
    b = [1 n:-1:1];
    c = [1 1:n];
    bi = ((1-p)^n)*cumprod(a.*b./c);
end
y = bi(k+1);
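
The cumulative-product strategy in ibinom can be checked against the closed form in (17.1). The sketch below repeats the p <= 0.5 branch for illustrative values n = 10, p = 0.3 and compares one term with the built-in nchoosek.

```matlab
% Sketch: the cumprod recursion for binomial terms, checked against C(n,k) p^k q^(n-k).
n = 10; p = 0.3;                          % assumed parameters
a = [1 (p/(1-p))*ones(1,n)];              % ratio p/q for each step
b = [1 n:-1:1];                           % numerators n, n-1, ..., 1
c = [1 1:n];                              % denominators 1, 2, ..., n
bi = ((1-p)^n)*cumprod(a.*b./c);          % P(X = 0), P(X = 1), ..., P(X = n)
k = 3;
direct = nchoosek(n,k)*p^k*(1-p)^(n-k);   % closed form for comparison
disp([bi(k+1) direct])
```

Each step multiplies the previous term by (n - j + 1)/j times p/q, so no large factorial is ever formed.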



ipoisson.m 17.40 

Description of Code 

ipoisson.m Poisson distribution: individual terms. As in the case of the binomial distribution,
we have an m-function for the individual terms and one for the cumulative case. The m-functions
ipoisson and cpoisson use a computational strategy similar to that used for the binomial case.
Not only does this work for large $\mu$, but the precision is at least as good as that for the binomial
m-functions. Experience indicates that the m-functions are good for $\mu \le 700$. They break down
at about 710, largely because of limitations of the MATLAB exponential function. For individual
terms, function y = ipoisson(mu,k) calculates the probabilities for mu a positive integer, k a
row or column vector of nonnegative integers. The output is a row vector of the corresponding
Poisson probabilities.

Code 



function y = ipoisson(mu,k)
% IPOISSON y = ipoisson(mu,k) Individual Poisson probabilities
% Version of 10/15/93
% mu = mean value
% k may be a row or column vector of integer values
% y = P(X = k) (a row vector of probabilities)
K = max(k);
p = exp(-mu)*cumprod([1 mu*ones(1,K)]./[1 1:K]);
y = p(k+1);



cpoisson.m 17.41

Description of Code

cpoisson.m Poisson distribution: cumulative terms. function y = cpoisson(mu,k) calculates
$P(X \ge k)$, where k may be a row or a column vector of nonnegative integers. The output is a row
vector of the corresponding probabilities.

Code 



function y = cpoisson(mu,k)
% CPOISSON y = cpoisson(mu,k) Cumulative Poisson probabilities
% Version of 10/15/93
% mu = mean value mu
% k may be a row or column vector of integer values
% y = P(X >= k) (a row vector of probabilities)
K = max(k);
p = exp(-mu)*cumprod([1 mu*ones(1,K)]./[1 1:K]);
pc = [1 1 - cumsum(p)];
y = pc(k+1);
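
The two Poisson listings share one line of arithmetic, which can be compared with the defining formula. The sketch below uses illustrative values mu = 2.5 and k = 0:5.

```matlab
% Sketch: Poisson terms by cumprod, checked against exp(-mu) mu^k / k!.
mu = 2.5; k = 0:5;                                 % assumed parameter and range
K = max(k);
p = exp(-mu)*cumprod([1 mu*ones(1,K)]./[1 1:K]);   % P(X = 0), ..., P(X = K)
pc = [1 1 - cumsum(p)];                            % P(X >= 0), ..., P(X >= K+1)
direct = exp(-mu)*mu.^k./factorial(k);             % defining formula
disp([p; direct])
```

The cumulative vector pc starts with P(X >= 0) = 1, so pc(k+1) is the probability of k or more, matching the indexing in cpoisson.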



nbinom.m 17.42 
Description of Code 

nbinom.m Negative binomial: function y = nbinom(m,p,k) calculates the probability that
the mth success in a Bernoulli sequence occurs on the kth trial.

Code 

function y = nbinom(m,p,k)
% NBINOM y = nbinom(m,p,k) Negative binomial probabilities
% Version of 12/10/92
% Probability the mth success occurs on the kth trial
% m a positive integer; p a probability
% k a matrix of integers greater than or equal to m
% y = P(X=k) (a matrix of the same dimensions as k)
q = 1 - p;
y = ((p^m)/gamma(m)).*(q.^(k - m)).*gamma(k)./gamma(k - m + 1);



gaussian.m 17.43 

Description of Code 

gaussian.m function y = gaussian(m,v,t) calculates the Gaussian (Normal) distribution
function for mean value m, variance v, and matrix t of values. The result $y = P(X \le t)$ is a
matrix of the same dimensions as t.

Code 



function y = gaussian(m,v,t)
% GAUSSIAN y = gaussian(m,v,t) Gaussian distribution function
% Version of 11/18/92
% Distribution function for X ~ N(m,v)
% m = mean, v = variance
% t is a matrix of evaluation points
% y = P(X<=t) (a matrix of the same dimensions as t)
u = (t - m)./sqrt(2*v);
if u >= 0
    y = 0.5*(erf(u) + 1);
else
    y = 0.5*erfc(-u);
end
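
When t is a matrix, the MATLAB test if u >= 0 is true only if every element of u is nonnegative; otherwise the erfc branch is applied to all elements. The sketch below, with illustrative values m = 0 and v = 4, indicates that the two branches give the same values, so either is valid for mixed signs.

```matlab
% Sketch: the two branches of gaussian agree elementwise.
m = 0; v = 4;                      % assumed mean and variance
t = [-3 0 2 3.92];                 % evaluation points of mixed sign
u = (t - m)./sqrt(2*v);
y_erfc = 0.5*erfc(-u);             % branch taken when some u < 0
y_erf  = 0.5*(erf(u) + 1);         % branch taken when all u >= 0
disp([y_erfc; y_erf])
```

The erfc form is the usual choice for negative arguments because it avoids subtracting nearly equal numbers far out in the lower tail.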



gaussdensity.m 17.44 

Description of Code 

gaussdensity.m function y = gaussdensity(m,v,t) calculates the Gaussian density function
$f_X(t)$ for mean value m, variance v, and matrix t of values.

Code 



function y = gaussdensity(m,v,t)
% GAUSSDENSITY y = gaussdensity(m,v,t) Gaussian density
% Version of 2/8/96
% m = mean, v = variance
% t is a matrix of evaluation points
y = exp(-((t-m).^2)/(2*v))/sqrt(v*2*pi);



norminv.m 17.45 

Description of Code 

norminv.m function y = norminv(m,v,p) calculates the inverse (the quantile function) of the
Gaussian distribution function for mean value m, variance v, and p a matrix of probabilities.

Code 



function y = norminv(m,v,p)
% NORMINV y = norminv(m,v,p) Inverse gaussian distribution
% (quantile function for gaussian)
% Version of 8/17/94
% m = mean, v = variance
% p is a matrix of probabilities
if p >= 0
    u = sqrt(2)*erfinv(2*p - 1);
else
    u = -sqrt(2)*erfinv(1 - 2*p);
end
y = sqrt(v)*u + m;



gammadbn.m 17.46 

Description of Code 

gammadbn.m function y = gammadbn(alpha,lambda,t) calculates the distribution function
for a gamma distribution with parameters alpha, lambda. t is a matrix of evaluation points. The
result is a matrix of the same size.

Code 



function y = gammadbn(alpha,lambda,t)
% GAMMADBN y = gammadbn(alpha,lambda,t) Gamma distribution
% Version of 12/10/92
% Distribution function for X ~ gamma(alpha,lambda)
% alpha, lambda are positive parameters
% t may be a matrix of positive numbers
% y = P(X<=t) (a matrix of the same dimensions as t)
y = gammainc(lambda*t,alpha);



beta.m 17.47 
Description of Code 

beta.m function y = beta(r,s,t) calculates the density function for the beta distribution with
parameters r, s. t is a matrix of numbers between zero and one. The result is a matrix of the same
size.

Code



function y = beta(r,s,t)
% BETA y = beta(r,s,t) Beta density function
% Version of 8/5/93
% Density function for Beta(r,s) distribution
% t is a matrix of evaluation points between 0 and 1
% y is a matrix of the same dimensions as t
y = (gamma(r+s)/(gamma(r)*gamma(s)))*(t.^(r-1).*(1-t).^(s-1));



betadbn.m 17.48 

Description of Code 

betadbn.m function y = betadbn(r,s,t) calculates the distribution function for the beta
distribution with parameters r, s. t is a matrix of evaluation points. The result is a matrix of the
same size.

Code 



function y = betadbn(r,s,t)
% BETADBN y = betadbn(r,s,t) Beta distribution function
% Version of 7/27/93
% Distribution function for X ~ beta(r,s)
% y = P(X<=t) (a matrix of the same dimensions as t)
y = betainc(t,r,s);



weibull.m 17.49 

Description of Code 

weibull.m function y = weibull(alpha,lambda,t) calculates the density function for the
Weibull distribution with parameters alpha, lambda. t is a matrix of evaluation points. The
result is a matrix of the same size.

Code 



function y = weibull(alpha,lambda,t)
% WEIBULL y = weibull(alpha,lambda,t) Weibull density
% Version of 1/24/91
% Density function for X ~ Weibull(alpha,lambda,0)
% t is a matrix of positive evaluation points
% y is a matrix of the same dimensions as t
y = alpha*lambda*(t.^(alpha - 1)).*exp(-lambda*(t.^alpha));

weibulld.m 17.50

Description of Code

weibulld.m function y = weibulld(alpha,lambda,t) calculates the distribution function for
the Weibull distribution with parameters alpha, lambda. t is a matrix of evaluation points. The
result is a matrix of the same size.

Code 



function y = weibulld(alpha,lambda,t)
% WEIBULLD y = weibulld(alpha,lambda,t) Weibull distribution function
% Version of 1/24/91
% Distribution function for X ~ Weibull(alpha,lambda,0)
% t is a matrix of positive evaluation points
% y = P(X<=t) (a matrix of the same dimensions as t)
y = 1 - exp(-lambda*(t.^alpha));
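
The density in weibull and the distribution function in weibulld should be related by integration. As a sketch, the built-in cumtrapz gives a numerical check; alpha = 2, lambda = 0.5 and the grid on [0, 3] are illustrative choices.

```matlab
% Sketch: numerical integral of the Weibull density against the distribution function.
alpha = 2; lambda = 0.5;                                     % assumed parameters
t = linspace(0, 3, 3001);                                    % evaluation grid, step 0.001
f = alpha*lambda*(t.^(alpha - 1)).*exp(-lambda*(t.^alpha));  % density, as in weibull
F = 1 - exp(-lambda*(t.^alpha));                             % distribution, as in weibulld
F_num = cumtrapz(t, f);                                      % trapezoidal integral of f
err = max(abs(F - F_num));
disp(err)
```

A small value of err indicates that the two expressions are consistent with each other.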



17.1.9 Binomial, Poisson, and Gaussian distributions 

bincomp.m 17.51 

Description of Code 

bincomp.m Graphical comparison of the binomial, Poisson, and Gaussian distributions. The 
procedure calls for binomial parameters n,p, determines a reasonable range of evaluation points 
and plots on the same graph the binomial distribution function, the Poisson distribution function, 
and the gaussian distribution function with the adjustment called the "continuity correction." 

Code 



% BINCOMP file bincomp.m Approx of binomial by Poisson and gaussian
% Version of 5/24/96
% Gaussian adjusted for "continuity correction"
% Plots distribution functions for specified parameters n, p
n = input('Enter the parameter n ');
p = input('Enter the parameter p ');
a = floor(n*p-2*sqrt(n*p));
a = max(a,1); % Prevents zero or negative indices
b = floor(n*p+2*sqrt(n*p));
k = a:b;
Fb = cumsum(ibinom(n,p,0:n)); % Binomial distribution function
Fp = cumsum(ipoisson(n*p,0:n)); % Poisson distribution function
Fg = gaussian(n*p,n*p*(1 - p),k+0.5); % Gaussian distribution function
stairs(k,Fb(k+1)) % Plotting details
hold on
plot(k,Fp(k+1),'-.',k,Fg,'o')
hold off
xlabel('t values') % Graph labeling details
ylabel('Distribution function')
title('Approximation of Binomial by Poisson and Gaussian')
grid
legend('Binomial','Poisson','Adjusted Gaussian')
disp('See Figure for results')



poissapp.m 17.52 

Description of Code 

poissapp.m Graphical comparison of the Poisson and Gaussian distributions. The procedure 
calls for a value of the Poisson parameter mu, then calculates and plots the Poisson distribution 
function, the Gaussian distribution function, and the adjusted Gaussian distribution function. 

Code 



% POISSAPP file poissapp.m Comparison of Poisson and gaussian
% Version of 5/24/96
% Plots distribution functions for specified parameter mu
mu = input('Enter the parameter mu ');
n = floor(1.5*mu);
k = floor(mu-2*sqrt(mu)):floor(mu+2*sqrt(mu));
FP = cumsum(ipoisson(mu,0:n));
FG = gaussian(mu,mu,k);
FC = gaussian(mu,mu,k-0.5);
stairs(k,FP(k))
hold on
plot(k,FG,'-.',k,FC,'o')
hold off
grid
xlabel('t values')
ylabel('Distribution function')
title('Gaussian Approximation to Poisson Distribution')
legend('Poisson','Gaussian','Adjusted Gaussian')
disp('See Figure for results')



17.1.10 Setup for simple random variables 

If a simple random variable X is in canonical form, the distribution consists of the coefficients of the indicator
functions (the values of X) and the probabilities of the corresponding events. If X is in a primitive form other
than canonical, the csort operation is applied to the coefficients of the indicator functions and the probabilities
of the corresponding events to obtain the distribution. If Z = g(X) and X is in a primitive form, then the
value of Z on the event in the partition associated with t_i is g(t_i). The distribution for Z is obtained by
applying csort to the g(t_i) and the p_i. Similarly, if Z = g(X, Y) and the joint distribution is available, the
value g(t_i, u_j) is associated with P(X = t_i, Y = u_j). The distribution for Z is obtained by applying csort
to the matrix of values and the corresponding matrix of probabilities.
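
The consolidation step can be illustrated with built-in functions only. The following is a minimal sketch under stated assumptions: the coefficients and probabilities are made-up data, and unique with accumarray serves as a stand-in for csort, whose listing is not part of this section.

```matlab
% Hypothetical primitive form: repeated values with their probabilities
c  = [2 -1 2 0 -1];               % coefficients of the indicator functions (assumed data)
pc = [0.1 0.2 0.3 0.25 0.15];     % probabilities of the corresponding events

% Sort the distinct values and add the probabilities of equal values
[X, ~, idx] = unique(c);          % X is sorted; idx maps each entry of c into X
PX = accumarray(idx(:), pc(:))';  % row vector of consolidated probabilities

% Distribution of Z = g(X) for the assumed g(t) = t.^2, handled the same way
[Z, ~, jdx] = unique(X.^2);       % distinct values of g, in increasing order
PZ = accumarray(jdx(:), PX(:))';  % probabilities of equal g-values added

disp([X; PX]')                    % columns: value of X, probability
disp([Z; PZ]')                    % columns: value of Z, probability
```

Here the values 2 and -1 each occur twice in the primitive form, so their probabilities are added when the distribution is formed.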

canonic.m 17.53

Description of Code

canonic.m The procedure determines the distribution for a simple random variable in affine
form, when the minterm probabilities are available. Input data are a row vector of coefficients for
the indicator functions in the affine form (with the constant value last) and a row vector of the
probabilities of the minterms generated by the events. Results consist of a row vector of values and
a row vector of the corresponding probabilities.

Code 



% CANONIC file canonic.m Distribution for simple rv in affine form
% Version of 6/12/95
% Determines the distribution for a simple random variable
% in affine form, when the minterm probabilities are available.
% Uses the m-functions mintable and csort.
% The coefficient vector must contain the constant term.
% If the constant term is zero, enter 0 in the last place.
c = input(' Enter row vector of coefficients ');
pm = input(' Enter row vector of minterm probabilities ');
n = length(c) - 1;
if 2^n ~= length(pm)
  error('Incorrect minterm probability vector length');
end
M = mintable(n); % Provides a table of minterm patterns
s = c(1:n)*M + c(n+1); % Evaluates X on each minterm
[X,PX] = csort(s,pm); % s = values; pm = minterm probabilities
XDBN = [X;PX]';
disp('Use row matrices X and PX for calculations')
disp('Call for XDBN to view the distribution')



canonicf.m 17.54 

Description of Code 

canonicf.m function [x,px] = canonicf(c,pm) is a function version of canonic, which allows
arbitrary naming of variables.

Code 



function [x,px] = canonicf(c,pm)
% CANONICF [x,px] = canonicf(c,pm) Function version of canonic
% Version of 6/12/95
% Allows arbitrary naming of variables
n = length(c) - 1;
if 2^n ~= length(pm)
  error('Incorrect minterm probability vector length');
end
M = mintable(n); % Provides a table of minterm patterns
s = c(1:n)*M + c(n+1); % Evaluates X on each minterm
[x,px] = csort(s,pm); % s = values; pm = minterm probabilities



jcalc.m 17.55 
Description of Code 

jcalc.m Sets up for calculations for joint simple random variables. The matrix P of 
P (X = ti,Y = Uj) is arranged as on the plane (i.e., values of Y increase upward). The MAT- 
LAB function meshgrid is applied to the row matrix X and the reversed row matrix for Y to put 
an appropriate X-value and Y-value at each position. These are in the "calculating matrices" t and 
u, respectively, which are used in determining probabilities and expectations of various functions 
of t, u. 
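
The "as on the plane" layout and the role of the calculating matrices can be seen on a small case. The following is a minimal sketch with an assumed 2-by-3 joint matrix; the data and the names EXY and PXltY are hypothetical, and sum(sum(...)) is used in place of the m-function total.

```matlab
X = [0 1 2];                       % assumed values of X
Y = [10 20];                       % assumed values of Y
P = [0.10 0.20 0.10;               % top row: Y = 20 (largest value of Y)
     0.20 0.30 0.10];              % bottom row: Y = 10

[t, u] = meshgrid(X, fliplr(Y));   % t(i,j) = X(j); u(i,j) = Y-value for row i

PX = sum(P);                       % column sums: marginal probabilities for X
PY = fliplr(sum(P'));              % row sums, reordered to match increasing Y

EXY   = sum(sum(t.*u.*P));         % E[XY]
PXltY = sum(sum((t < u/10).*P));   % P(X < Y/10), an event defined by t and u
```

Each position of t and u holds the X-value and Y-value that go with the probability at the same position of P, so expectations and probabilities of events reduce to array operations followed by a sum.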

Code 



% JCALC file jcalc.m Calculation setup for joint simple rv
% Version of 4/7/95 (Update of prompt and display 5/1/95)
% Setup for calculations for joint simple random variables
% The joint probabilities arranged as on the plane
% (top row corresponds to largest value of Y)
P = input('Enter JOINT PROBABILITIES (as on the plane) ');
X = input('Enter row matrix of VALUES of X ');
Y = input('Enter row matrix of VALUES of Y ');
PX = sum(P); % probabilities for X
PY = fliplr(sum(P')); % probabilities for Y
[t,u] = meshgrid(X,fliplr(Y));
disp(' Use array operations on matrices X, Y, PX, PY, t, u, and P')



jcalcf.m 17.56 

Description of Code 

jcalcf.m function [x,y,t,u,px,py,p] = jcalcf(X,Y,P) is a function version of jcalc, which
allows arbitrary naming of variables.

Code 



function [x,y,t,u,px,py,p] = jcalcf(X,Y,P)
% JCALCF [x,y,t,u,px,py,p] = jcalcf(X,Y,P) Function version of jcalc
% Version of 5/3/95
% Allows arbitrary naming of variables
if sum(size(P) ~= [length(Y) length(X)]) > 0
  error(' Incompatible vector sizes')
end
x = X;
y = Y;
p = P;
px = sum(P);
py = fliplr(sum(P'));
[t,u] = meshgrid(X,fliplr(Y));



jointzw.m 17.57 

Description of Code 

jointzw.m Sets up joint distribution for Z = g (X, Y) and W = h (X, Y) and provides calculating 
matrices as in jcalc. Inputs are P, X, and Y as well as array expressions for g(t,u) and h(t,u). 
Outputs are matrices Z,W,PZW for the joint distribution, marginal probabilities PZ,PW, and 
the calculating matrices v,w. 

Code 



% JOINTZW file jointzw.m Joint dbn for two functions of (X,Y)
% Version of 4/29/97
% Obtains joint distribution for
% Z = g(X,Y) and W = h(X,Y)
% Inputs P, X, and Y as well as array
% expressions for g(t,u) and h(t,u)
P = input('Enter joint prob for (X,Y) ');
X = input('Enter values for X ');
Y = input('Enter values for Y ');
[t,u] = meshgrid(X,fliplr(Y));
G = input('Enter expression for g(t,u) ');
H = input('Enter expression for h(t,u) ');
[Z,PZ] = csort(G,P);
[W,PW] = csort(H,P);
r = length(W);
c = length(Z);
PZW = zeros(r,c);
for i = 1:r
  for j = 1:c
    a = find((G==Z(j))&(H==W(i)));
    if ~isempty(a)
      PZW(i,j) = total(P(a));
    end
  end
end
PZW = flipud(PZW);
[v,w] = meshgrid(Z,fliplr(W));
if (G==t)&(H==u)
  disp(' ')
  disp(' Note: Z = X and W = Y')
  disp(' ')
elseif G==t
  disp(' ')
  disp(' Note: Z = X')
  disp(' ')
elseif H==u
  disp(' ')
  disp(' Note: W = Y')
  disp(' ')
end
disp('Use array operations on Z, W, PZ, PW, v, w, PZW')



jdtest.m 17.58 
Description of Code 
jdtest.m Tests a joint probability matrix P for negative entries and unit total probability.

Code 



function y = jdtest(P)
% JDTEST y = jdtest(P) Tests P for unit total and negative elements
% Version of 10/8/93
M = min(min(P));
S = sum(sum(P));
if M < 0
  y = 'Negative entries';
elseif abs(1 - S) > 1e-7
  y = 'Probabilities do not sum to one';
else
  y = 'P is a valid distribution';
end



17.1.11 Setup for general random variables 

tappr.m 17.59 

Description of Code 

tappr.m Uses the density function to set up a discrete approximation to the distribution for 
absolutely continuous random variable X. 
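
The approximation assigns to the midpoint of each subinterval a probability mass proportional to the density there. The following is a minimal sketch, assuming the hypothetical density f(t) = 3t^2 on [0, 1] and 200 approximation points. The midpoints are generated by index rather than by the colon expression used in tappr; the two give the same points.

```matlab
a = 0;  b = 1;  n = 200;        % assumed range [a, b] and number of points
d = (b - a)/n;                  % width of each subinterval
t = a + d*((1:n) - 0.5);        % midpoints, same as (a:d:b-d) + d/2
f = 3*t.^2;                     % assumed density f(t) = 3t^2 on [0, 1]

PX = f*d;                       % approximate probability mass at each midpoint
PX = PX/sum(PX);                % normalize so that the masses sum to one
X  = t;

EX = dot(X, PX);                % approximates E[X] = 3/4
VX = dot(X.^2, PX) - EX^2;      % approximates Var[X] = 3/80
```

Once X and PX are available, calculations proceed exactly as for a simple random variable, with accuracy governed by the number of approximation points.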

Code 



% TAPPR file tappr.m Discrete approximation to ac random variable
% Version of 4/16/94
% Sets up discrete approximation to distribution for
% absolutely continuous random variable X
% Density is entered as a function of t
r = input('Enter matrix [a b] of x-range endpoints ');
n = input('Enter number of x approximation points ');
d = (r(2) - r(1))/n;
t = (r(1):d:r(2)-d) + d/2;
PX = input('Enter density as a function of t ');
PX = PX*d;
PX = PX/sum(PX);
X = t;
disp('Use row matrices X and PX as in the simple case')



tuappr.m 17.60 
Description of Code 
tuappr.m Uses the joint density to set up discrete approximations to X, Y, t, u, and density. 

Code 



% TUAPPR file tuappr.m Discrete approximation to joint ac pair
% Version of 2/20/96
% Joint density entered as a function of t, u
% Sets up discrete approximations to X, Y, t, u, and density
rx = input('Enter matrix [a b] of X-range endpoints ');
ry = input('Enter matrix [c d] of Y-range endpoints ');
nx = input('Enter number of X approximation points ');
ny = input('Enter number of Y approximation points ');
dx = (rx(2) - rx(1))/nx;
dy = (ry(2) - ry(1))/ny;
X = (rx(1):dx:rx(2)-dx) + dx/2;
Y = (ry(1):dy:ry(2)-dy) + dy/2;
[t,u] = meshgrid(X,fliplr(Y));
P = input('Enter expression for joint density ');
P = dx*dy*P;
P = P/sum(sum(P));
PX = sum(P);
PY = fliplr(sum(P'));
disp('Use array operations on X, Y, PX, PY, t, u, and P')



dfappr.m 17.61 
Description of Code 
dfappr.m Approximate discrete distribution from distribution function entered as a function of t. 

Code 



% DFAPPR file dfappr.m Discrete approximation from distribution function
% Version of 10/21/95
% Approximate discrete distribution from distribution
% function entered as a function of t
r = input('Enter matrix [a b] of X-range endpoints ');
s = input('Enter number of X approximation points ');
d = (r(2) - r(1))/s;
t = (r(1):d:r(2)-d) + d/2;
m = length(t);
f = input('Enter distribution function F as function of t ');
f = [0 f];
PX = f(2:m+1) - f(1:m);
PX = PX/sum(PX);
X = t - d/2;
disp(' Distribution is in row matrices X and PX')



acsetup.m 17.62 

Description of Code 

acsetup.m Approximate distribution for absolutely continuous random variable X. Density is 
entered as a string variable function of t. 

Code 



% ACSETUP file acsetup.m Discrete approx from density as string variable
% Version of 10/22/94
% Approximate distribution for absolutely continuous rv X
% Density is entered as a string variable function of t
disp('DENSITY f is entered as a STRING VARIABLE.')
disp('either defined previously or upon call.')
r = input('Enter matrix [a b] of x-range endpoints ');
s = input('Enter number of x approximation points ');
d = (r(2) - r(1))/s;
t = (r(1):d:r(2)-d) + d/2;
m = length(t);
f = input('Enter density as a function of t ');
PX = eval(f);
PX = PX*d;
PX = PX/sum(PX);
X = t;
disp(' Distribution is in row matrices X and PX')



dfsetup.m 17.63 

Description of Code 

dfsetup.m Approximate discrete distribution from distribution function entered as a string vari- 
able function of t. 

Code

% DFSETUP file dfsetup.m Discrete approx from string dbn function
% Version of 10/21/95
% Approximate discrete distribution from distribution
% function entered as string variable function of t
disp('DISTRIBUTION FUNCTION F is entered as a STRING')
disp(' VARIABLE, either defined previously or upon call')
r = input('Enter matrix [a b] of X-range endpoints ');
s = input('Enter number of X approximation points ');
d = (r(2) - r(1))/s;
t = (r(1):d:r(2)-d) + d/2;
m = length(t);
F = input('Enter distribution function F as function of t ');
f = eval(F);
f = [0 f];
PX = f(2:m+1) - f(1:m);
PX = PX/sum(PX);
X = t - d/2;
disp(' Distribution is in row matrices X and PX')



17.1.12 Setup for independent simple random variables 

MATLAB version 5.1 has provisions for multidimensional arrays, which make possible more direct imple- 
mentation of icalc3 and icalc4. 

icalc.m 17.64 

Description of Code 

icalc.m Calculation setup for an independent pair of simple random variables. Input consists of
marginal distributions for X, Y. Output is joint distribution and calculating matrices t, u.

Code 



% ICALC file icalc.m Calculation setup for independent pair
% Version of 5/3/95
% Joint calculation setup for independent pair
X = input('Enter row matrix of X-values ');
Y = input('Enter row matrix of Y-values ');
PX = input('Enter X probabilities ');
PY = input('Enter Y probabilities ');
[a,b] = meshgrid(PX,fliplr(PY));
P = a.*b; % Matrix of joint independent probabilities
[t,u] = meshgrid(X,fliplr(Y)); % t, u matrices for joint calculations
disp(' Use array operations on matrices X, Y, PX, PY, t, u, and P')



icalcf.m 17.65

Description of Code

icalcf.m [x,y,t,u,px,py,p] = icalcf(X,Y,PX,PY) is a function version of icalc, which allows
arbitrary naming of variables.

Code



function [x,y,t,u,px,py,p] = icalcf(X,Y,PX,PY)
% ICALCF [x,y,t,u,px,py,p] = icalcf(X,Y,PX,PY) Function version of icalc
% Version of 5/3/95
% Allows arbitrary naming of variables
x = X;
y = Y;
px = PX;
py = PY;
if length(X) ~= length(PX)
  error(' X and PX of different lengths')
elseif length(Y) ~= length(PY)
  error(' Y and PY of different lengths')
end
[a,b] = meshgrid(PX,fliplr(PY));
p = a.*b; % Matrix of joint independent probabilities
[t,u] = meshgrid(X,fliplr(Y)); % t, u matrices for joint calculations



icalc3.m 17.66 
Description of Code 
icalc3.m Calculation setup for an independent class of three simple random variables. 

Code 



% ICALC3 file icalc3.m Setup for three independent rv
% Version of 5/15/96
% Sets up for calculations for three
% independent simple random variables
% Uses m-functions rep, elrep, kronf
X = input('Enter row matrix of X-values ');
Y = input('Enter row matrix of Y-values ');
Z = input('Enter row matrix of Z-values ');
PX = input('Enter X probabilities ');
PY = input('Enter Y probabilities ');
PZ = input('Enter Z probabilities ');
n = length(X);
m = length(Y);
s = length(Z);
[t,u] = meshgrid(X,Y);
t = rep(t,1,s);
u = rep(u,1,s);
v = elrep(Z,m,n); % t,u,v matrices for joint calculations
P = kronf(PZ,kronf(PX,PY'));
disp('Use array operations on matrices X, Y, Z,')
disp('PX, PY, PZ, t, u, v, and P')



icalc4.m 17.67 
Description of Code 
icalc4.m Calculation setup for an independent class of four simple random variables. 

Code 



% ICALC4 file icalc4.m Setup for four independent rv
% Version of 5/15/96
% Sets up for calculations for four
% independent simple random variables
% Uses m-functions rep, elrep, kronf
X = input('Enter row matrix of X-values ');
Y = input('Enter row matrix of Y-values ');
Z = input('Enter row matrix of Z-values ');
W = input('Enter row matrix of W-values ');
PX = input('Enter X probabilities ');
PY = input('Enter Y probabilities ');
PZ = input('Enter Z probabilities ');
PW = input('Enter W probabilities ');
n = length(X);
m = length(Y);
s = length(Z);
r = length(W);
[t,u] = meshgrid(X,Y);
t = rep(t,r,s);
u = rep(u,r,s);
[v,w] = meshgrid(Z,W);
v = elrep(v,m,n); % t,u,v,w matrices for joint calculations
w = elrep(w,m,n);
P = kronf(kronf(PZ,PW'),kronf(PX,PY'));
disp('Use array operations on matrices X, Y, Z, W')
disp('PX, PY, PZ, PW, t, u, v, w, and P')



17.1.13 Calculations for random variables 

ddbn.m 17.68 

Description of Code 

ddbn.m Uses the distribution of a simple random variable (or simple approximation) to plot a
step graph for the distribution function F_X.

Code



% DDBN file ddbn.m Step graph of distribution function
% Version of 10/25/95
% Plots step graph of dbn function FX from
% distribution of simple rv (or simple approximation)
xc = input('Enter row matrix of VALUES ');
pc = input('Enter row matrix of PROBABILITIES ');
m = length(xc);
FX = cumsum(pc);
xt = [xc(1)-1-0.1*abs(xc(1)) xc xc(m)+1+0.1*abs(xc(m))];
FX = [0 FX 1]; % Artificial extension of range and domain
stairs(xt,FX) % Plot of stairstep graph
hold on
plot(xt,FX,'o') % Marks values at jump
hold off
grid
xlabel('t')
ylabel('u = F(t)')
title('Distribution Function')



cdbn.m 17.69 

Description of Code 

cdbn.m Plots a continuous graph of a distribution function of a simple random variable (or simple 
approximation) . 

Code 



% CDBN file cdbn.m Continuous graph of distribution function
% Version of 1/29/97
% Plots continuous graph of dbn function FX from
% distribution of simple rv (or simple approximation)
xc = input('Enter row matrix of VALUES ');
pc = input('Enter row matrix of PROBABILITIES ');
m = length(xc);
FX = cumsum(pc);
xt = [xc(1)-0.01 xc xc(m)+0.01];
FX = [0 FX FX(m)]; % Artificial extension of range and domain
plot(xt,FX) % Plot of continuous graph
grid
xlabel('t')
ylabel('u = F(t)')
title('Distribution Function')



simple.m 17.70

Description of Code

simple.m Calculates basic quantities for simple random variables from the distribution, input as
row matrices X and PX.

Code



% SIMPLE file simple.m Calculates basic quantities for simple rv
% Version of 6/18/95
X = input('Enter row matrix of X-values ');
PX = input('Enter row matrix PX of X probabilities ');
n = length(X); % dimension of X
EX = dot(X,PX) % E[X]
EX2 = dot(X.^2,PX) % E[X^2]
VX = EX2 - EX^2 % Var[X]
disp(' ')
disp('Use row matrices X and PX for further calculations')



jddbn.m 17.71 

Description of Code 

jddbn.m Representation of joint distribution function for simple pair by obtaining the value of
F_XY at the lower left hand corners of each grid cell.
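
The two cumulative sums involved can be traced on a small case. The following is a minimal sketch, assuming a hypothetical 2-by-3 joint matrix arranged as on the plane (top row for the largest value of Y).

```matlab
P = [0.10 0.20 0.10;               % assumed joint matrix; top row: largest Y
     0.20 0.30 0.10];              % bottom row: smallest Y

FXY = flipud(cumsum(flipud(P)));   % accumulate from the bottom row upward (over Y)
FXY = cumsum(FXY')';               % accumulate from left to right (over X)

disp(FXY)                          % upper right entry is the total probability
```

Because the matrix is arranged as on the plane, the accumulation over Y must run from the bottom row upward, which is the reason for the two calls to flipud.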

Code 



% JDDBN file jddbn.m Joint distribution function
% Version of 10/7/96
% Joint discrete distribution function for
% joint matrix P (arranged as on the plane).
% Values at lower left hand corners of grid cells
P = input('Enter joint probability matrix (as on the plane) ');
FXY = flipud(cumsum(flipud(P)));
FXY = cumsum(FXY')';
disp('To view corner values for joint dbn function, call for FXY')



jsimple.m 17.72 

Description of Code 

jsimple.m Calculates basic quantities for a joint simple pair {X, Y} from the joint distribution
X, Y, P as in jcalc. Calculated quantities include means, variances, covariance, regression line, and
regression curve (conditional expectation E[Y|X = t]).
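
The quantities named above follow directly from the calculating matrices. The following is a minimal sketch on an assumed 2-by-3 joint distribution; the data are hypothetical, and sum(sum(...)) replaces the m-function total so that the fragment runs without the other m-files.

```matlab
X = [0 1 2];  Y = [10 20];            % assumed values
P = [0.10 0.20 0.10;                  % top row: Y = 20
     0.20 0.30 0.10];                 % bottom row: Y = 10
[t, u] = meshgrid(X, fliplr(Y));      % calculating matrices

PX = sum(P);                          % marginal probabilities for X
EX = sum(sum(t.*P));                  % E[X]
EY = sum(sum(u.*P));                  % E[Y]
VX = sum(sum(t.^2.*P)) - EX^2;        % Var[X]
CV = sum(sum(t.*u.*P)) - EX*EY;       % Cov[X,Y] = E[XY] - E[X]E[Y]

a = CV/VX;                            % regression line of Y on X: u = a*t + b
b = EY - a*EX;
eYx = sum(u.*P)./PX;                  % E[Y|X = t] for each value t of X
```

The regression line is the best linear predictor of Y from X, while eYx gives the regression curve; averaging eYx against PX recovers E[Y].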

Code

% JSIMPLE file jsimple.m Calculates basic quantities for joint simple rv
% Version of 5/25/95
% The joint probabilities are arranged as on the plane
% (the top row corresponds to the largest value of Y)
P = input('Enter JOINT PROBABILITIES (as on the plane) ');
X = input('Enter row matrix of VALUES of X ');
Y = input('Enter row matrix of VALUES of Y ');
disp(' ')
PX = sum(P); % marginal distribution for X
PY = fliplr(sum(P')); % marginal distribution for Y
XDBN = [X; PX]';
YDBN = [Y; PY]';
PT = idbn(PX,PY);
D = total(abs(P - PT)); % test for difference
if D > 1e-8 % to prevent roundoff error masking zero
  disp('{X,Y} is NOT independent')
else
  disp('{X,Y} is independent')
end
disp(' ')
[t,u] = meshgrid(X,fliplr(Y));
EX = total(t.*P) % E[X]
EY = total(u.*P) % E[Y]
EX2 = total((t.^2).*P) % E[X^2]
EY2 = total((u.^2).*P) % E[Y^2]
EXY = total(t.*u.*P) % E[XY]
VX = EX2 - EX^2 % Var[X]
VY = EY2 - EY^2 % Var[Y]
cv = EXY - EX*EY; % Cov[X,Y] = E[XY] - E[X]E[Y]
if abs(cv) > 1e-9 % to prevent roundoff error masking zero
  CV = cv
else
  CV = 0
end
a = CV/VX % regression line of Y on X is
b = EY - a*EX % u = at + b
R = CV/sqrt(VX*VY); % correlation coefficient rho
disp(['The regression line of Y on X is: u = ',num2str(a),'t + ',num2str(b)])
disp(['The correlation coefficient is: rho = ',num2str(R)])
disp(' ')
eYx = sum(u.*P)./PX;
EYX = [X; eYx]';
disp('Marginal dbns are in X, PX, Y, PY; to view, call XDBN, YDBN')
disp('E[Y|X = x] is in eYx; to view, call for EYX')
disp('Use array operations on matrices X, Y, PX, PY, t, u, and P')






japprox.m 17.73 

Description of Code 

japprox.m Assumes discrete setup and calculates basic quantities for a pair of random variables 
as in jsimple. Plots the regression line and regression curve. 

Code 



% JAPPROX file japprox.m Basic quantities for ac pair {X,Y}
% Version of 5/7/96
% Assumes tuappr has set X, Y, PX, PY, t, u, P
EX = total(t.*P) % E[X]
EY = total(u.*P) % E[Y]
EX2 = total(t.^2.*P) % E[X^2]
EY2 = total(u.^2.*P) % E[Y^2]
EXY = total(t.*u.*P) % E[XY]
VX = EX2 - EX^2 % Var[X]
VY = EY2 - EY^2 % Var[Y]
cv = EXY - EX*EY; % Cov[X,Y] = E[XY] - E[X]E[Y]
if abs(cv) > 1e-9 % to prevent roundoff error masking zero
  CV = cv
else
  CV = 0
end
a = CV/VX % regression line of Y on X is
b = EY - a*EX % u = at + b
R = CV/sqrt(VX*VY);
disp(['The regression line of Y on X is: u = ',num2str(a),'t + ',num2str(b)])
disp(['The correlation coefficient is: rho = ',num2str(R)])
disp(' ')
eY = sum(u.*P)./sum(P); % eY(t) = E[Y|X = t]
RL = a*X + b;
plot(X,RL,X,eY,'-.')
grid
title('Regression line and Regression curve')
xlabel('X values')
ylabel('Y values')
legend('Regression line','Regression curve')
clear eY % To conserve memory
clear RL
disp(' Calculate with X, Y, t, u, P, as in joint simple case')



17.1.14 Calculations and tests for independent random variables 

mgsum.m 17.74 

Description of Code 

mgsum.m function [z,pz] = mgsum(x,y,px,py) determines the distribution for the sum of an
independent pair of simple random variables from their distributions.

Code



function [z,pz] = mgsum(x,y,px,py)
% MGSUM [z,pz] = mgsum(x,y,px,py) Sum of two independent simple rv
% Version of 5/6/96
% Distribution for the sum of two independent simple random variables
% x is a vector (row or column) of X values
% y is a vector (row or column) of Y values
% px is a vector (row or column) of X probabilities
% py is a vector (row or column) of Y probabilities
% z and pz are row vectors
[a,b] = meshgrid(x,y);
t = a+b;
[c,d] = meshgrid(px,py);
p = c.*d;
[z,pz] = csort(t,p);
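
The steps in mgsum can be traced on a small case. The following is a minimal sketch with assumed distributions for X and Y; since csort is not listed in this section, the built-in functions unique and accumarray are used as a stand-in for the final consolidation.

```matlab
x = [0 1 2];   px = [0.5 0.3 0.2];   % assumed distribution for X
y = [1 2];     py = [0.4 0.6];       % assumed distribution for Y

[a, b] = meshgrid(x, y);             % all pairs of values
[c, d] = meshgrid(px, py);           % matching pairs of probabilities
s = a + b;                           % value of X + Y on each pair
p = c.*d;                            % product rule for an independent pair

[z, ~, idx] = unique(s(:)');         % distinct sums in increasing order
pz = accumarray(idx(:), p(:))';      % add the probabilities of equal sums

disp([z; pz]')                       % columns: value of X + Y, probability
```

The sums 2 and 3 each arise from two different pairs, so the consolidation step is what turns the matrix of pairwise sums into a distribution.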



mgsum3.m 17.75 

Description of Code 

mgsum3.m function [w,pw] = mgsum3(x,y,z,px,py,pz) extends mgsum to three random variables
by repeated application of mgsum. Similarly for mgsum4.m.

Code



function [w,pw] = mgsum3(x,y,z,px,py,pz)
% MGSUM3 [w,pw] = mgsum3(x,y,z,px,py,pz) Sum of three independent simple rv
% Version of 5/2/96
% Distribution for the sum of three independent simple random variables
% x is a vector (row or column) of X values
% y is a vector (row or column) of Y values
% z is a vector (row or column) of Z values
% px is a vector (row or column) of X probabilities
% py is a vector (row or column) of Y probabilities
% pz is a vector (row or column) of Z probabilities
% w and pw are row vectors
[a,pa] = mgsum(x,y,px,py);
[w,pw] = mgsum(a,z,pa,pz);



mgnsum.m 17.76 
Description of Code 

mgnsum.m function [z,pz] = mgnsum(X,P) determines the distribution for a sum of n independent
random variables. X is an n-row matrix of X-values and P is an n-row matrix of P-values
(padded with zeros, if necessary, to make all rows the same length).

Code



function [z,pz] = mgnsum(X,P)
% MGNSUM [z,pz] = mgnsum(X,P) Sum of n independent simple rv
% Version of 5/16/96
% Distribution for the sum of n independent simple random variables
% X an n-row matrix of X-values
% P an n-row matrix of P-values
% padded with zeros, if necessary
% to make all rows the same length
[n,r] = size(P);
z = 0;
pz = 1;
for i = 1:n
  x = X(i,:);
  p = P(i,:);
  x = x(find(p>0));
  p = p(find(p>0));
  [z,pz] = mgsum(z,x,pz,p);
end



mgsumn.m 17.77 

Description of Code 

mgsumn.m function [z,pz] = mgsumn(varargin) is an alternate to mgnsum, utilizing varargin
in MATLAB version 5.1. The call is of the form [z,pz] = mgsumn([x1;p1], [x2;p2], ...,
[xn;pn]).

Code



function [z,pz] = mgsumn(varargin)
% MGSUMN [z,pz] = mgsumn([x1;p1], [x2;p2], ..., [xn;pn])
% Version of 6/2/97 Uses MATLAB version 5.1
% Sum of n independent simple random variables
% Utilizes distributions in the form [x;px] (two rows)
% Iterates mgsum
n = length(varargin); % The number of distributions
z = 0; % Initialization
pz = 1;
for i = 1:n % Repeated use of mgsum
  [z,pz] = mgsum(z,varargin{i}(1,:),pz,varargin{i}(2,:));
end



diidsum.m 17.78 

Description of Code 

diidsum.m function [x,px] = diidsum(X,PX,n) determines the sum of n iid simple random
variables, with the common distribution X, PX.

Code



function [x,px] = diidsum(X,PX,n)
% DIIDSUM [x,px] = diidsum(X,PX,n) Sum of n iid simple random variables
% Version of 10/14/95 Input rev 5/13/97
% Sum of n iid rv with common distribution X, PX
% Uses m-function mgsum
x = X; % Initialization
px = PX;
for i = 1:n-1
  [x,px] = mgsum(x,X,px,PX);
end



itest.m 17.79 

Description of Code 

itest.m Tests for independence the matrix P of joint probabilities for a simple pair {X, Y} of 
random variables. 
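
The test compares each joint probability with the product of the corresponding marginal probabilities. The following is a minimal sketch on two assumed joint matrices, one built from a product of marginals and one not; the names ptInd, ptDep, and productRuleHolds are hypothetical, and any(...) replaces the m-function total.

```matlab
% Hypothetical helper: true when every entry of pt satisfies the product rule.
% The row marginals are taken in the row order of pt, so no reversal is needed.
productRuleHolds = @(pt) ~any(any(abs(pt - sum(pt, 2)*sum(pt, 1)) > 1e-9));

ptInd = [0.6; 0.4]*[0.2 0.3 0.5];   % assumed joint matrix built as a product
ptDep = [0.2 0.1;                   % assumed joint matrix that is not a product
         0.1 0.6];

indResult = productRuleHolds(ptInd);   % product rule holds everywhere
depResult = productRuleHolds(ptDep);   % product rule fails somewhere
```

The threshold 1e-9 plays the same role as in itest: it keeps roundoff in the products from being mistaken for a failure of the product rule.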

Code 



% ITEST file itest.m Tests P for independence
% Version of 5/9/95
% Tests for independence the matrix of joint
% probabilities for a simple pair {X,Y}
pt = input('Enter matrix of joint probabilities ');
disp(' ')
px = sum(pt); % Marginal probabilities for X
py = sum(pt'); % Marginal probabilities for Y (reversed)
[a,b] = meshgrid(px,py);
PT = a.*b; % Joint independent probabilities
D = abs(pt - PT) > 1e-9; % Threshold set above roundoff
if total(D) > 0
  disp('The pair {X,Y} is NOT independent')
  disp('To see where the product rule fails, call for D')
else
  disp('The pair {X,Y} is independent')
end



idbn.m 17.80 

Description of Code 

idbn.m function p = idbn(px,py) uses marginal probabilities to determine the joint probability
matrix (arranged as on the plane) for an independent pair of simple random variables.

Code



function p = idbn(px,py)
% IDBN p = idbn(px,py) Matrix of joint independent probabilities
% Version of 5/9/95
% Determines joint probability matrix for two independent
% simple random variables (arranged as on the plane)
[a,b] = meshgrid(px,fliplr(py));
p = a.*b;



isimple.m 17.81 

Description of Code 

isimple.m Takes as inputs the marginal distributions for an independent pair {X, Y} of sim- 
ple random variables. Sets up the joint distribution probability matrix P as in idbn, and forms 
the calculating matrices t,u as in jcalc. Calculates basic quantities and makes available matrices 
X, Y, PX, PY, P, t, u for additional calculations. 

Code 



% ISIMPLE file isimple.m Calculations for independent simple rv
% Version of 5/3/95
X = input('Enter row matrix of X-values ');
Y = input('Enter row matrix of Y-values ');
PX = input('Enter X probabilities ');
PY = input('Enter Y probabilities ');
[a,b] = meshgrid(PX,fliplr(PY));
P = a.*b; % Matrix of joint independent probabilities
[t,u] = meshgrid(X,fliplr(Y)); % t, u matrices for joint calculations
EX = dot(X,PX) % E[X]
EY = dot(Y,PY) % E[Y]
VX = dot(X.^2,PX) - EX^2 % Var[X]
VY = dot(Y.^2,PY) - EY^2 % Var[Y]
disp(' Use array operations on matrices X, Y, PX, PY, t, u, and P')



17.1.15 Quantile functions for bounded distributions 

dquant.m 17.82 

Description of Code 

dquant.m function t = dquant(X,PX,U) determines the values of the quantile function for a
simple random variable with distribution X, PX at the probability values in row vector U. The
probability vector U is often determined by a random number generator.

Code



function t = dquant(X,PX,U)
% DQUANT t = dquant(X,PX,U) Quantile function for a simple random variable
% Version of 10/14/95
% U is a vector of probabilities
m = length(X);
n = length(U);
F = [0 cumsum(PX)+1e-12];
F(m+1) = 1; % Makes maximum value exactly one
if U(n) >= 1 % Prevents improper values of probability U
  U(n) = 1;
end
if U(1) <= 0
  U(1) = 1e-9;
end
f = rowcopy(F,n); % n rows of F
u = colcopy(U,m); % m columns of U
t = X*((f(:,1:m) < u)&(u <= f(:,2:m+1)))';
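
The rule implemented by dquant is that Q(u) is the value X(k) for which F(k-1) < u <= F(k). The following is a minimal sketch that applies this rule with an explicit loop instead of the matrix comparison above; the distribution and the probability values in U are assumed for illustration, and the m-functions rowcopy and colcopy are not needed.

```matlab
X  = [-1 0 2];                      % assumed values
PX = [0.35 0.25 0.40];              % assumed probabilities
U  = [0.10 0.30 0.40 0.70 0.99];    % assumed probability values, 0 < u <= 1

F = cumsum(PX);                     % distribution function at the values of X
F(end) = 1;                         % guards against roundoff in the last entry
t = zeros(size(U));
for i = 1:length(U)
    k = find(U(i) <= F, 1);         % first index k with u <= F(k)
    t(i) = X(k);                    % Q(u) is the corresponding value of X
end
disp([U; t]')                       % columns: u, Q(u)
```

If U is generated by rand, the resulting t is a simulated sample from the distribution X, PX, which is how dsample and qsample use dquant.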



dquanplot.m 17.83 

Description of Code 

dquanplot.m Plots as a stairs graph the quantile function for a simple random variable X. The
plot is the values of X versus the distribution function F_X.

Code



% DQUANPLOT file dquanplot.m Plot of quantile function for a simple rv
% Version of 7/6/95
% Uses stairs to plot the inverse of FX
X = input('Enter VALUES for X ');
PX = input('Enter PROBABILITIES for X ');
m = length(X);
F = [0 cumsum(PX)];
XP = [X X(m)];
stairs(F,XP)
grid
title('Plot of Quantile Function')
xlabel('u')
ylabel('t = Q(u)')
hold on
plot(F(2:m+1),X,'o') % Marks values at jumps
hold off



dsample.m 17.84 
Description of Code 
dsample.m Calculates a sample from a discrete distribution, determines the relative frequencies
of values, and compares with actual probabilities. Input consists of value and probability matrices
for X and the sample size n. A matrix U is determined by a random number generator, and the
m-function dquant is used to calculate the corresponding sample values. Various data on the sample
are calculated and displayed.

Code



% DSAMPLE file dsample.m Simulates sample from discrete population
% Version of 12/31/95 (Display revised 3/24/97)
% Relative frequencies vs probabilities for
% sample from discrete population distribution
X = input('Enter row matrix of VALUES ');
PX = input('Enter row matrix of PROBABILITIES ');
n = input('Sample size n ');
U = rand(1,n);
T = dquant(X,PX,U);
[x,fr] = csort(T,ones(1,length(T)));
disp(' Value Prob Rel freq')
disp([x; PX; fr/n]')
ex = sum(T)/n;
EX = dot(X,PX);
vx = sum(T.^2)/n - ex^2;
VX = dot(X.^2,PX) - EX^2;
disp(['Sample average ex = ',num2str(ex)])
disp(['Population mean E[X] = ',num2str(EX)])
disp(['Sample variance vx = ',num2str(vx)])
disp(['Population variance Var[X] = ',num2str(VX)])



quanplot.m 17.85 

Description of Code 

quanplot.m Plots the quantile function for a distribution function F_X. Assumes the procedure
dfsetup or acsetup has been run. A suitable set U of probability values is determined and the
m-function dquant is used to determine corresponding values of the quantile function. The results
are plotted.

Code



% QUANPLOT file quanplot.m Plots quantile function for dbn function
% Version of 2/2/96
% Assumes dfsetup or acsetup has been run
% Uses m-function dquant
X = input('Enter row matrix of values ');
PX = input('Enter row matrix of probabilities ');
h = input('Probability increment h ');
U = h:h:1;
T = dquant(X,PX,U);
U = [0 U 1];
Te = X(m) + abs(X(m))/20;
T = [X(1) T Te];
plot(U,T) % Plot rather than stairs for general case
grid
title('Plot of Quantile Function')
xlabel('u')
ylabel('t = Q(u)')



qsample.m 17.86 

Description of Code 

qsample.m Simulates a sample for a given population density. Determines sample parameters and 
approximate population parameters. Assumes dfsetup or acsetup has been run. Takes as input the 
distribution matrices X, PX and the sample size n. Uses a random number generator to obtain the 
probability matrix U and uses the m-function dquant to determine the sample. 

Code 



% QSAMPLE file qsample.m  Simulates sample for given population density
% Version of 1/31/96
% Determines sample parameters
% and approximate population parameters.
% Assumes dfsetup or acsetup has been run
X  = input('Enter row matrix of VALUES ');
PX = input('Enter row matrix of PROBABILITIES ');
n  = input('Sample size n = ');
m  = length(X);
U  = rand(1,n);
T  = dquant(X,PX,U);
ex = sum(T)/n;
EX = dot(X,PX);
vx = sum(T.^2)/n - ex^2;
VX = dot(X.^2,PX) - EX^2;
disp('The sample is in column vector T')
disp(['Sample average ex = ',num2str(ex),])
disp(['Approximate population mean E(X) = ',num2str(EX),])
disp(['Sample variance vx = ',num2str(vx),])
disp(['Approximate population variance V(X) = ',num2str(VX),])



targetset.m 17.87 

Description of Code 

targetset.m Setup for arrival at a target set of values. Used in conjunction with the m-procedure 
targetrun to determine the number of trials needed to visit k of a specified set of target values. 
Input consists of the distribution matrices X, PX and the specified set E of target values. 

Code 

% TARGETSET file targetset.m  Setup for sample arrival at target set
% Version of 6/24/95
X  = input('Enter population VALUES ');
PX = input('Enter population PROBABILITIES ');
ms = length(X);
x  = 1:ms;                    % Value indices
disp('The set of population values is')
disp(X);
E  = input('Enter the set of target values ');
ne = length(E);
e  = zeros(1,ne);
for i = 1:ne
  e(i) = dot(E(i) == X,x);    % Target value indices
end
F  = [0 cumsum(PX)];
A  = F(1:ms);
B  = F(2:ms+1);
disp('Call for targetrun')



targetrun.m 17.88 

Description of Code 

targetrun.m Assumes the m-file targetset has provided the basic data. Input consists of the 
number r of repetitions and the number k of the target states to visit. Calculates and displays 
various results. 

Code 



% TARGETRUN file targetrun.m  Number of trials to visit k target values
% Version of 6/24/95 Rev for Version 5.1 1/30/98
% Assumes the procedure targetset has been run.
r  = input('Enter the number of repetitions ');
disp('The target set is')
disp(E)
ks = input('Enter the number of target values to visit ');
if isempty(ks)
  ks = ne;
end
if ks > ne
  ks = ne;
end
clear T                       % Trajectory in value indices (reset)
R0 = zeros(1,ms);             % Indicator for target value indices
R0(e) = ones(1,ne);
S  = zeros(1,r);              % Number of trials for each run (reset)
for k = 1:r
  R = R0;
  i = 1;
  while sum(R) > ne - ks
    u = rand(1,1);
    s = ((A < u)&(u <= B))*x';
    if R(s) == 1              % Deletes indices as values reached
      R(s) = 0;
    end
    T(i) = s;
    i = i+1;
  end
  S(k) = i-1;
end
if r == 1
  disp(['The number of trials to completion is ',int2str(i-1),])
  disp(['The initial value is ',num2str(X(T(1))),])
  disp(['The terminal value is ',num2str(X(T(i-1))),])
  N  = 1:i-1;
  TR = [N;X(T)]';
  disp('To view the trajectory, call for TR')
else
  [t,f] = csort(S,ones(1,r));
  D  = [t;f]';
  p  = f/r;
  AV = dot(t,p);
  SD = sqrt(dot(t.^2,p) - AV^2);
  MN = min(t);
  MX = max(t);
  disp(['The average completion time is ',num2str(AV),])
  disp(['The standard deviation is ',num2str(SD),])
  disp(['The minimum completion time is ',int2str(MN),])
  disp(['The maximum completion time is ',int2str(MX),])
  disp(' ')
  disp('To view a detailed count, call for D.')
  disp('The first column shows the various completion times;')
  disp('the second column shows the numbers of trials yielding those times')
  plot(t,cumsum(p))
  grid
  title('Fraction of Runs t Steps or Less')
  ylabel('Fraction of runs')
  xlabel('t = number of steps to complete run')
end
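The core loop of targetrun, drawing independent values until k distinct target values have appeared, can be sketched in Python as follows. This is a minimal illustration rather than a translation: the function name trials_to_visit is hypothetical, and random.Random.choices stands in for the interval test on the cumulative probabilities A and B.

```python
import random

def trials_to_visit(values, probs, targets, k, rng):
    """Number of iid draws needed until k distinct target values appear."""
    remaining = set(targets)
    k = min(k, len(remaining))          # cannot visit more targets than exist
    stop_at = len(remaining) - k        # targets allowed to remain unvisited
    trials = 0
    while len(remaining) > stop_at:
        x = rng.choices(values, weights=probs, k=1)[0]
        remaining.discard(x)            # a visited target is struck off
        trials += 1
    return trials
```

Repeating the call r times and tabulating the results gives the empirical distribution of completion times that targetrun reports.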
17.1.16 Compound demand 

The following pattern provides a useful model in many situations. Consider 

$D = \sum_{k=0}^{N} Y_k$ (17.2) 

where $Y_0 = 0$, and the class $\{Y_k : 1 \le k\}$ is iid, independent of the counting random variable N. One natural 
interpretation is to consider N to be the number of customers in a store and $Y_k$ the amount purchased by 
the kth customer. Then D is the total demand of the actual customers. Hence, we call D the compound 
demand. 

gend.m 17.89 

Description of Code 

gend.m Uses coefficients of the generating functions for N and Y to calculate, in the integer case, 
the marginal distribution for the compound demand D and the joint distribution for {N, D}. 

Code 



% GEND file gend.m  Marginal and joint dbn for integer compound demand
% Version of 5/21/97
% Calculates marginal distribution for compound demand D
% and joint distribution for {N,D} in the integer case
% Do not forget zero coefficients for missing powers
% in the generating functions for N, Y
disp('Do not forget zero coefficients for missing powers')
gn = input('Enter gen fn COEFFICIENTS for gN ');
gy = input('Enter gen fn COEFFICIENTS for gY ');
n  = length(gn) - 1;           % Highest power in gN
m  = length(gy) - 1;           % Highest power in gY
P  = zeros(n + 1,n*m + 1);     % Base for generating P
y  = 1;                        % Initialization
P(1,1) = gn(1);                % First row of P (P(N=0) in the first position)
for i = 1:n                    % Row by row determination of P
  y = conv(y,gy);              % Successive powers of gy
  P(i+1,1:i*m+1) = y*gn(i+1);  % Successive rows of P
end
PD = sum(P);                   % Probability for each possible value of D
a  = find(gn);                 % Location of nonzero N probabilities
b  = find(PD);                 % Location of nonzero D probabilities
P  = P(a,b);                   % Removal of zero rows and columns
P  = rot90(P);                 % Orientation as on the plane
N  = 0:n;
N  = N(a);                     % N values with positive probabilities
PN = gn(a);                    % Positive N probabilities
Y  = 0:m;                      % All possible values of Y
Y  = Y(find(gy));              % Y values with positive probabilities
PY = gy(find(gy));             % Positive Y probabilities
D  = 0:n*m;                    % All possible values of D
PD = PD(b);                    % Positive D probabilities
D  = D(b);                     % D values with positive probabilities
gD = [D; PD]';                 % Display combination
disp('Results are in N, PN, Y, PY, D, PD, P')
disp('May use jcalc or jcalcf on N, D, P')
disp('To view distribution for D, call for gD')
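The marginal calculation in gend amounts to mixing successive powers of the generating function of Y with the coefficients of the generating function of N. A minimal Python sketch of that idea follows; the names convolve and compound_demand are hypothetical, only the marginal distribution of D is computed (not the joint distribution), and plain lists of coefficients replace the MATLAB row matrices.

```python
def convolve(a, b):
    """Coefficients of the product of two polynomials."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def compound_demand(gn, gy):
    """Return pd with pd[d] = P(D = d), from coefficients of g_N and g_Y."""
    n, m = len(gn) - 1, len(gy) - 1
    pd = [0.0] * (n * m + 1)
    power = [1.0]                      # coefficients of g_Y(s)**0
    pd[0] += gn[0]                     # N = 0 forces D = 0
    for i in range(1, n + 1):
        power = convolve(power, gy)    # coefficients of g_Y(s)**i
        for d, c in enumerate(power):
            pd[d] += gn[i] * c
    return pd
```

For example, compound_demand([0.5, 0.5], [0, 0.5, 0.5]) treats N as 0 or 1 with equal probability and Y as 1 or 2 with equal probability, and yields the probabilities 0.5, 0.25, 0.25 for D = 0, 1, 2.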



gendf.m 17.90 

Description of Code 

gendf.m function [d,pd] = gendf(gn,gy) is a function version of gend, which allows arbitrary 
naming of the variables. Calculates the distribution for D, but not the joint distribution for {N, D}. 

Code 



function [d,pd] = gendf(gn,gy)
% GENDF [d,pd] = gendf(gN,gY)  Function version of gend.m
% Calculates marginal for D in the integer case
% Version of 5/21/97
% Do not forget zero coefficients for missing powers
% in the generating functions for N, Y
n  = length(gn) - 1;           % Highest power in gN
m  = length(gy) - 1;           % Highest power in gY
P  = zeros(n + 1,n*m + 1);     % Base for generating P
y  = 1;                        % Initialization
P(1,1) = gn(1);                % First row of P (P(N=0) in the first position)
for i = 1:n                    % Row by row determination of P
  y = conv(y,gy);              % Successive powers of gy
  P(i+1,1:i*m+1) = y*gn(i+1);  % Successive rows of P
end
PD = sum(P);                   % Probability for each possible value of D
D  = 0:n*m;                    % All possible values of D
b  = find(PD);                 % Location of nonzero D probabilities
d  = D(b);                     % D values with positive probabilities
pd = PD(b);                    % Positive D probabilities



mgd.m 17.91 

Description of Code 

mgd.m Uses coefficients for the generating function for N and the distribution for simple Y to 
calculate the distribution for the compound demand. 

Code 



% MGD file mgd.m  Moment generating function for compound demand
% Version of 5/19/97
% Uses m-functions csort, mgsum
disp('Do not forget zero coefficients for missing')
disp('powers in the generating function for N')
disp(' ')
g  = input('Enter COEFFICIENTS for gN ');
y  = input('Enter VALUES for Y ');
p  = input('Enter PROBABILITIES for Y ');
n  = length(g);                % Initialization
a  = 0;
b  = 1;
D  = a;
PD = g(1);
for i = 2:n
  [a,b] = mgsum(y,a,p,b);
  D  = [D a];
  PD = [PD b*g(i)];
  [D,PD] = csort(D,PD);
end
r  = find(PD>1e-13);
D  = D(r);                     % Values with positive probability
PD = PD(r);                    % Corresponding probabilities
mD = [D; PD]';                 % Display details
disp('Values are in row matrix D; probabilities are in PD.')
disp('To view the distribution, call for mD.')
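When Y is not integer valued, the same mixture can be built by summing values directly instead of convolving coefficient lists. The Python sketch below illustrates this under stated assumptions: the function name compound_demand_general is hypothetical, dictionaries keyed by the attained sum play the role of mgsum and csort, and sums are matched by exact floating-point equality, so values that differ only by round-off would be kept as separate entries.

```python
from collections import defaultdict

def compound_demand_general(gn, y_values, y_probs):
    """Sorted (value, probability) pairs for D; gn holds the coefficients of g_N."""
    total = defaultdict(float)
    current = {0: 1.0}                 # distribution of a sum of zero demands
    total[0] += gn[0]
    for i in range(1, len(gn)):
        nxt = defaultdict(float)
        for s, ps in current.items():  # add one more independent demand
            for y, py in zip(y_values, y_probs):
                nxt[s + y] += ps * py
        current = nxt
        for s, ps in current.items():
            total[s] += gn[i] * ps
    # Discard values whose probability is zero up to round-off, as mgd does
    return sorted((v, p) for v, p in total.items() if p > 1e-13)
```

As in mgd, the list gn must include zeros for missing powers of the generating function for N.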



mgdf.m 17.92 

Description of Code 

mgdf.m function [d,pd] = mgdf(pn,y,py) is a function version of mgd, which allows arbitrary 
naming of the variables. The input matrix pn is the coefficient matrix for the counting random 
variable generating function. Zeros for the missing powers must be included. The matrices y, py 
are the actual values and probabilities of the demand random variable. 

Code 



function [d,pd] = mgdf(pn,y,py)
% MGDF [d,pd] = mgdf(pn,y,py)  Function version of mgd
% Version of 5/19/97
% Uses m-functions mgsum and csort
% Do not forget zero coefficients for missing
% powers in the generating function for N
n  = length(pn);               % Initialization
a  = 0;
b  = 1;
d  = a;
pd = pn(1);
for i = 2:n
  [a,b] = mgsum(y,a,py,b);
  d  = [d a];
  pd = [pd b*pn(i)];
  [d,pd] = csort(d,pd);
end
a  = find(pd>1e-13);           % Location of positive probabilities
pd = pd(a);                    % Positive probabilities
d  = d(a);                     % D values with positive probability



randbern.m 17.93 

Description of Code 

randbern.m Let S be the number of successes in a random number N of Bernoulli trials, with 
probability p of success on each trial. The procedure randbern takes as inputs the probability p 
of success and the distribution matrices N, PN for the counting random variable N and calculates 
the joint distribution for {N, S} and the marginal distribution for S. 

Code 



% RANDBERN file randbern.m  Random number of Bernoulli trials
% Version of 12/19/96; notation modified 5/20/97
% Joint and marginal distributions for a random number of Bernoulli trials
% N is the number of trials
% S is the number of successes
p  = input('Enter the probability of success ');
N  = input('Enter VALUES of N ');
PN = input('Enter PROBABILITIES for N ');
n  = length(N);
m  = max(N);
S  = 0:m;
P  = zeros(n,m+1);
for i = 1:n
  P(i,1:N(i)+1) = PN(i)*ibinom(N(i),p,0:N(i));
end
PS = sum(P);
P  = rot90(P);
disp('Joint distribution N, S, P, and marginal PS')
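The joint distribution built by randbern weights a binomial distribution for each possible number of trials by the probability of that number. A small Python sketch of the calculation follows; the name random_bernoulli is hypothetical, math.comb replaces ibinom, and the joint matrix is returned with one row per value of N rather than rotated as in the MATLAB procedure.

```python
from math import comb

def random_bernoulli(p, n_values, n_probs):
    """Joint P(N = n, S = k) and marginal P(S = k) for random trial count N."""
    m = max(n_values)
    joint = []
    for n, pn in zip(n_values, n_probs):
        # Binomial(n, p) probabilities scaled by P(N = n); zero for k > n
        row = [pn * comb(n, k) * p**k * (1 - p)**(n - k) if k <= n else 0.0
               for k in range(m + 1)]
        joint.append(row)
    marginal = [sum(row[k] for row in joint) for k in range(m + 1)]
    return joint, marginal
```

With p = 0.5 and N equal to 1 or 2 with equal probability, the marginal for S is 0.375, 0.5, 0.125 on the values 0, 1, 2.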



17.1.17 Simulation of Markov systems 

inventory1.m 17.94 

Description of Code 

inventory1.m Calculates the transition matrix for an (m,M) inventory policy. At the end of 
each period, if the stock is less than a reorder point m, stock is replenished to the level M. Demand 
in each period is an integer valued random variable Y. Input consists of the parameters m, M and 
the distribution for Y as a simple random variable (or a discrete approximation). 

Code 

% INVENTORY1 file inventory1.m  Generates P for (m,M) inventory policy
% Version of 1/27/97
% Data for transition probability calculations
% for (m,M) inventory policy
M  = input('Enter value M of maximum stock ');
m  = input('Enter value m of reorder point ');
Y  = input('Enter row vector of demand values ');
PY = input('Enter demand probabilities ');
states = 0:M;
ms = length(states);
my = length(Y);
% Calculations for determining P
[y,s] = meshgrid(Y,states);
T  = max(0,M-y).*(s < m) + max(0,s-y).*(s >= m);
P  = zeros(ms,ms);
for i = 1:ms
  [a,b] = meshgrid(T(i,:),states);
  P(i,:) = PY*(a==b)';
end
disp('Result is in matrix P')
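The transition rule coded in inventory1 is short enough to restate directly: a stock below m is first raised to M, then the period's demand is subtracted, with the result floored at zero. The following Python sketch illustrates that rule; the function name inventory_transition is hypothetical and nested lists stand in for the MATLAB matrix P.

```python
def inventory_transition(M, m, demand_values, demand_probs):
    """Transition matrix on stock levels 0..M for an (m, M) policy."""
    size = M + 1
    P = [[0.0] * size for _ in range(size)]
    for s in range(size):
        start = M if s < m else s          # stock on hand after any reorder
        for y, py in zip(demand_values, demand_probs):
            nxt = max(0, start - y)        # unmet demand is lost
            P[s][nxt] += py
    return P
```

For M = 2, m = 1 and demand 0 or 1 with equal probability, the rows are [0, 0.5, 0.5], [0.5, 0.5, 0] and [0, 0.5, 0.5].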



branchp.m 17.95 

Description of Code 

branchp.m Calculates the transition matrix for a simple branching process with a specified 
maximum population. Input consists of the maximum population value M and the coefficient matrix 
for the generating function for the individual propagation random variables $Z_i$. The latter matrix 
must include zero coefficients for missing powers. 

Code 



% BRANCHP file branchp.m  Transition P for simple branching process
% Version of 7/25/95
% Calculates transition matrix for a simple branching
% process with specified maximum population.
disp('Do not forget zero probabilities for missing values of Z')
PZ = input('Enter PROBABILITIES for individuals ');
M  = input('Enter maximum allowable population ');
mz = length(PZ) - 1;
EZ = dot(0:mz,PZ);
disp(['The average individual propagation is ',num2str(EZ),])
P  = zeros(M+1,M+1);
Z  = zeros(M,M*mz+1);
k  = 0:M*mz;
a  = min(M,k);
z  = 1;
P(1,1) = 1;
for i = 1:M                    % Operation similar to gend
  z = conv(PZ,z);
  Z(i,1:i*mz+1) = z;
  [t,p] = csort(a,Z(i,:));
  P(i+1,:) = p;
end
disp('The transition matrix is P')
disp('To study the evolution of the process, call for branchdbn')
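Row i of the branching transition matrix is the distribution of the sum of i independent copies of Z, with every total at or above M lumped into state M. A Python sketch of that construction follows; the names convolve and branching_transition are hypothetical, and the lumping done by min and csort in branchp is written as an explicit accumulation.

```python
def convolve(a, b):
    """Coefficients of the product of two polynomials."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def branching_transition(pz, M):
    """Transition matrix on populations 0..M; pz[j] = P(Z = j)."""
    P = [[0.0] * (M + 1) for _ in range(M + 1)]
    P[0][0] = 1.0                      # population 0 is absorbing
    dist = [1.0]
    for i in range(1, M + 1):
        dist = convolve(dist, pz)      # offspring total for i individuals
        for k, pr in enumerate(dist):
            P[i][min(M, k)] += pr      # totals above M are capped at M
    return P
```

With pz = [0.5, 0, 0.5] and M = 2, row 1 is [0.5, 0, 0.5] and row 2 is [0.25, 0, 0.75].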



chainset.m 17.96 

Description of Code 

chainset.m Sets up for simulation of Markov chains. Inputs are the transition matrix P, the set 
of states, and an optional set of target states. The chain generating procedures listed below assume 
this procedure has been run. 

Code 



% CHAINSET file chainset.m  Setup for simulating Markov chains
% Version of 1/2/96 Revise 7/31/97 for version 4.2 and 5.1
P  = input('Enter the transition matrix ');
ms = length(P(1,:));
MS = 1:ms;
states = input('Enter the states if not 1:ms ');
if isempty(states)
  states = MS;
end
disp('States are')
disp([MS; states]')
PI = input('Enter the long-run probabilities ');
F  = [zeros(1,ms); cumsum(P')]';
A  = F(:,MS);
B  = F(:,MS+1);
e  = input('Enter the set of target states ');
ne = length(e);
E  = zeros(1,ne);
for i = 1:ne
  E(i) = MS(e(i)==states);
end
disp(' ')
disp('Call for appropriate chain generating procedure')



mchain.m 17.97 

Description of Code 

mchain.m Assumes chainset has been run. Generates trajectory of specified length, with specified 
initial state. 

Code 

% MCHAIN file mchain.m  Simulation of Markov chains
% Version of 1/2/96 Revised 7/31/97 for version 4.2 and 5.1
% Assumes the procedure chainset has been run
n  = input('Enter the number n of stages ');
st = input('Enter the initial state ');
if ~isempty(st)
  s = MS(st==states);
else
  s = 1;
end
T  = zeros(1,n);               % Trajectory in state numbers
U  = rand(1,n);
for i = 1:n
  T(i) = s;
  s = ((A(s,:) < U(i))&(U(i) <= B(s,:)))*MS';
end
N  = 0:n-1;
tr = [N; states(T)]';
n10 = min(n,11);
TR = tr(1:n10,:);
f  = ones(1,n)/n;
[sn,p] = csort(T,f);
if isempty(PI)
  disp('     State     Frac')
  disp([states; p]')
else
  disp('     State     Frac       PI')
  disp([states; p; PI]')
end
disp('To view the first part of the trajectory of states, call for TR')
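The state update in mchain picks the state whose cumulative-probability interval (A, B] contains a uniform random number. The Python sketch below shows the same selection with a binary search over cumulative row sums. It is an illustration only: simulate_chain and state_fractions are hypothetical names, states are numbered from 0, and the uniform draw is taken on (0, 1] so that a state with zero probability is never selected.

```python
import bisect
import random
from itertools import accumulate

def simulate_chain(P, n, start=0, seed=0):
    """Trajectory of n stages, as state numbers, for transition matrix P."""
    rng = random.Random(seed)
    cum = [list(accumulate(row)) for row in P]
    s, path = start, []
    for _ in range(n):
        path.append(s)
        u = 1.0 - rng.random()                     # uniform on (0, 1]
        # First state whose cumulative probability reaches u
        s = min(bisect.bisect_left(cum[s], u), len(P) - 1)
    return path

def state_fractions(path, n_states):
    """Fraction of stages spent in each state."""
    return [path.count(s) / len(path) for s in range(n_states)]
```

For a long trajectory of an ergodic chain, the fractions returned by state_fractions can be compared with the long-run probabilities, which is the comparison mchain displays when PI is supplied.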



arrival.m 17.98 

Description of Code 

arrival.m Assumes chainset has been run. Calculates repeatedly the arrival time to a prescribed 
set of states. 



Code 



% ARRIVAL file arrival.m  Arrival time to a set of states
% Version of 1/2/96 Revised 7/31/97 for version 4.2 and 5.1
% Calculates repeatedly the arrival
% time to a prescribed set of states.
% Assumes the procedure chainset has been run.
r  = input('Enter the number of repetitions ');
disp('The target state set is:')
disp(e)
st = input('Enter the initial state ');
if ~isempty(st)
  si = MS(st==states);         % Initial state number
else
  si = 1;
end
clear T                        % Trajectory in state numbers (reset)
S  = zeros(1,r);               % Arrival time for each rep (reset)
TS = zeros(1,r);               % Terminal state number for each rep (reset)
for k = 1:r
  R = zeros(1,ms);             % Indicator for target state numbers
  R(E) = ones(1,ne);           % reset for target state numbers
  s = si;
  T(1) = s;
  i = 1;
  while R(s) ~= 1              % While s is not a target state number
    u = rand(1,1);
    s = ((A(s,:) < u)&(u <= B(s,:)))*MS';
    i = i+1;
    T(i) = s;
  end
  S(k) = i-1;                  % i is the number of stages; i-1 is time
  TS(k) = T(i);
end
[ts,ft] = csort(TS,ones(1,r)); % ts = terminal state numbers  ft = frequencies
fts = ft/r;                    % Relative frequency of each ts
[a,at] = csort(TS,S);          % at = arrival time for each ts
w  = at./ft;                   % Average arrival time for each ts
RES = [states(ts); fts; w]';
disp(' ')
if r == 1
  disp(['The arrival time is ',int2str(i-1),])
  disp(['The state reached is ',num2str(states(ts)),])
  N  = 0:i-1;
  TR = [N;states(T)]';
  disp('To view the trajectory of states, call for TR')
else
  disp(['The result of ',int2str(r),' repetitions is:'])
  disp('Term state  Rel Freq   Av time')
  disp(RES)
  disp(' ')
  [t,f] = csort(S,ones(1,r));  % t = arrival times  f = frequencies
  p  = f/r;                    % Relative frequency of each t
  dbn = [t; p]';
  AV = dot(t,p);
  SD = sqrt(dot(t.^2,p) - AV^2);
  MN = min(t);
  MX = max(t);
  disp(['The average arrival time is ',num2str(AV),])
  disp(['The standard deviation is ',num2str(SD),])
  disp(['The minimum arrival time is ',int2str(MN),])
  disp(['The maximum arrival time is ',int2str(MX),])
  disp('To view the distribution of arrival times, call for dbn')
  disp('To plot the arrival time distribution, call for plotdbn')
end



recurrence.m 17.99 

Description of Code 

recurrence.m Assumes chainset has been run. Calculates repeatedly the recurrence time to a 
prescribed set of states, if initial state is in the set; otherwise calculates the arrival time. 

Code 



% RECURRENCE file recurrence.m  Recurrence time to a set of states
% Version of 1/2/96 Revised 7/31/97 for version 4.2 and 5.1
% Calculates repeatedly the recurrence time
% to a prescribed set of states, if initial
% state is in the set; otherwise arrival time.
% Assumes the procedure chainset has been run.
r  = input('Enter the number of repetitions ');
disp('The target state set is:')
disp(e)
st = input('Enter the initial state ');
if ~isempty(st)
  si = MS(st==states);         % Initial state number
else
  si = 1;
end
clear T                        % Trajectory in state numbers (reset)
S  = zeros(1,r);               % Recurrence time for each rep (reset)
TS = zeros(1,r);               % Terminal state number for each rep (reset)
for k = 1:r
  R = zeros(1,ms);             % Indicator for target state numbers
  R(E) = ones(1,ne);           % reset for target state numbers
  s = si;
  T(1) = s;
  i = 1;
  if R(s) == 1
    u = rand(1,1);
    s = ((A(s,:) < u)&(u <= B(s,:)))*MS';
    i = i+1;
    T(i) = s;
  end
  while R(s) ~= 1              % While s is not a target state number
    u = rand(1,1);
    s = ((A(s,:) < u)&(u <= B(s,:)))*MS';
    i = i+1;
    T(i) = s;
  end
  S(k) = i-1;                  % i is the number of stages; i-1 is time
  TS(k) = T(i);
end
[ts,ft] = csort(TS,ones(1,r)); % ts = terminal state numbers  ft = frequencies
fts = ft/r;                    % Relative frequency of each ts
[a,tt] = csort(TS,S);          % tt = total time for each ts
w  = tt./ft;                   % Average time for each ts
RES = [states(ts); fts; w]';
disp(' ')
if r == 1
  disp(['The recurrence time is ',int2str(i-1),])
  disp(['The state reached is ',num2str(states(ts)),])
  N  = 0:i-1;
  TR = [N;states(T)]';
  disp('To view the trajectory of state numbers, call for TR')
else
  disp(['The result of ',int2str(r),' repetitions is:'])
  disp('Term state  Rel Freq   Av time')
  disp(RES)
  disp(' ')
  [t,f] = csort(S,ones(1,r));  % t = recurrence times  f = frequencies
  p  = f/r;                    % Relative frequency of each t
  dbn = [t; p]';
  AV = dot(t,p);
  SD = sqrt(dot(t.^2,p) - AV^2);
  MN = min(t);
  MX = max(t);
  disp(['The average recurrence time is ',num2str(AV),])
  disp(['The standard deviation is ',num2str(SD),])
  disp(['The minimum recurrence time is ',int2str(MN),])
  disp(['The maximum recurrence time is ',int2str(MX),])
  disp('To view the distribution of recurrence times, call for dbn')
  disp('To plot the recurrence time distribution, call for plotdbn')
end
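The arrival and recurrence procedures differ only in whether a first transition is forced when the chain starts inside the target set. A compact Python sketch of both follows. The names next_state and hitting_time are hypothetical, states are numbered from 0, and the recurrence flag is an assumption of this sketch rather than part of the MATLAB procedures.

```python
import bisect
import random
from itertools import accumulate

def next_state(cum_row, rng):
    """One transition, given the cumulative probabilities of the current row."""
    u = 1.0 - rng.random()                         # uniform on (0, 1]
    return min(bisect.bisect_left(cum_row, u), len(cum_row) - 1)

def hitting_time(P, targets, start, rng, recurrence=False):
    """Return (time, terminal state) for reaching the target set."""
    cum = [list(accumulate(row)) for row in P]
    s, steps = start, 0
    if recurrence and s in targets:
        s = next_state(cum[s], rng)                # forced first move
        steps = 1
    while s not in targets:
        s = next_state(cum[s], rng)
        steps += 1
    return steps, s
```

For the deterministic cycle 0 to 1 to 2 to 0 with target set {0}, the arrival time from state 0 is 0 while the recurrence time is 3.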



kvis.m 17.100 

Description of Code 

kvis.m Assumes chainset has been run. Calculates repeatedly the time to complete visits to a 
specified k of the states in a prescribed set. 

Code 



% KVIS file kvis.m  Time to complete k visits to a set of states
% Version of 1/2/96 Revised 7/31/97 for version 4.2 and 5.1
% Calculates repeatedly the time to complete
% visits to k of the states in a prescribed set.
% Default is visit to all the target states.
% Assumes the procedure chainset has been run.
r  = input('Enter the number of repetitions ');
disp('The target state set is:')
disp(e)
ks = input('Enter the number of target states to visit ');
if isempty(ks)
  ks = ne;
end
if ks > ne
  ks = ne;
end
st = input('Enter the initial state ');
if ~isempty(st)
  si = MS(st==states);         % Initial state number
else
  si = 1;
end
disp(' ')
clear T                        % Trajectory in state numbers (reset)
R0 = zeros(1,ms);              % Indicator for target state numbers
R0(E) = ones(1,ne);            % reset
S  = zeros(1,r);               % Terminal transitions for each rep (reset)
for k = 1:r
  R = R0;
  s = si;
  if R(s) == 1
    R(s) = 0;
  end
  i = 1;
  T(1) = s;
  while sum(R) > ne - ks
    u = rand(1,1);
    s = ((A(s,:) < u)&(u <= B(s,:)))*MS';
    if R(s) == 1
      R(s) = 0;
    end
    i = i+1;
    T(i) = s;
  end
  S(k) = i-1;
end
if r == 1
  disp(['The time for completion is ',int2str(i-1),])
  N  = 0:i-1;
  TR = [N;states(T)]';
  disp('To view the trajectory of states, call for TR')
else
  [t,f] = csort(S,ones(1,r));
  p  = f/r;
  D  = [t;f]';
  AV = dot(t,p);
  SD = sqrt(dot(t.^2,p) - AV^2);
  MN = min(t);
  MX = max(t);
  disp(['The average completion time is ',num2str(AV),])
  disp(['The standard deviation is ',num2str(SD),])
  disp(['The minimum completion time is ',int2str(MN),])
  disp(['The maximum completion time is ',int2str(MX),])
  disp(' ')
  disp('To view a detailed count, call for D.')
  disp('The first column shows the various completion times;')
  disp('the second column shows the numbers of trials yielding those times')
end



plotdbn 17.101 

Description of Code 

plotdbn Used after m-procedures arrival or recurrence to plot arrival or recurrence time 
distribution. 

Code 



% PLOTDBN file plotdbn.m
% Version of 1/23/98
% Plot arrival or recurrence time dbn
% Use after procedures arrival or recurrence
% to plot arrival or recurrence time distribution
plot(t,p,'-',t,p,'+')
grid
title('Time Distribution')
xlabel('Time in number of transitions')
ylabel('Relative frequency')



17.2 Appendix B to Applied Probability: some mathematical aids 2 

17.2.1 Series 

1. Geometric series. From the expression $(1 - r)(1 + r + r^2 + \cdots + r^n) = 1 - r^{n+1}$, we obtain 

$\sum_{k=0}^{n} r^k = \frac{1 - r^{n+1}}{1 - r}$ for $r \ne 1$ (17.3) 

For $|r| < 1$, these sums converge to the geometric series $\sum_{k=0}^{\infty} r^k = \frac{1}{1 - r}$. 
Differentiation yields the following two useful series: 

$\sum_{k=1}^{\infty} k r^{k-1} = \frac{1}{(1 - r)^2}$ for $|r| < 1$ and $\sum_{k=2}^{\infty} k(k - 1) r^{k-2} = \frac{2}{(1 - r)^3}$ for $|r| < 1$ (17.4) 



2 This content is available online at <http://cnx.org/content/m23990/1.5/>. 
For the finite sum, differentiation and algebraic manipulation yields 

$\sum_{k=0}^{n} k r^{k-1} = \frac{1 - r^n [1 + n(1 - r)]}{(1 - r)^2}$, which converges to $\frac{1}{(1 - r)^2}$ for $|r| < 1$ (17.5) 
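A quick numerical check of the finite-sum formulas can help catch algebra slips. The Python sketch below compares direct summation with the closed forms for the geometric sum and its derivative; the function names are hypothetical and the check assumes r is not equal to 1.

```python
def geometric_sum(r, n):
    """Direct evaluation of sum_{k=0}^{n} r**k."""
    return sum(r**k for k in range(n + 1))

def geometric_closed(r, n):
    """Closed form (1 - r**(n+1)) / (1 - r), valid for r != 1."""
    return (1 - r**(n + 1)) / (1 - r)

def derivative_sum(r, n):
    """Direct evaluation of sum_{k=1}^{n} k * r**(k-1)."""
    return sum(k * r**(k - 1) for k in range(1, n + 1))

def derivative_closed(r, n):
    """Closed form (1 - r**n * (1 + n*(1 - r))) / (1 - r)**2, valid for r != 1."""
    return (1 - r**n * (1 + n * (1 - r))) / (1 - r)**2
```

For |r| < 1 and large n, derivative_closed(r, n) approaches 1/(1 - r)**2, the limit quoted for the infinite series.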



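2. Exponential series. $e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}$ and $e^{-x} = \sum_{k=0}^{\infty} (-1)^k \frac{x^k}{k!}$ for any x 

Simple algebraic manipulation yields the following equalities useful for the Poisson distribution: 

$\sum_{k=n}^{\infty} k \frac{x^k}{k!} = x \sum_{k=n-1}^{\infty} \frac{x^k}{k!}$ and $\sum_{k=n}^{\infty} k(k - 1) \frac{x^k}{k!} = x^2 \sum_{k=n-2}^{\infty} \frac{x^k}{k!}$ (17.6) 

3. Sums of powers of integers. $\sum_{i=1}^{n} i = \frac{n(n + 1)}{2}$ and $\sum_{i=1}^{n} i^2 = \frac{n(n + 1)(2n + 1)}{6}$ 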



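17.2.2 Some useful integrals 

1. The gamma function. $\Gamma(r) = \int_0^{\infty} t^{r-1} e^{-t}\,dt$ for $r > 0$ 

Integration by parts shows $\Gamma(r) = (r - 1)\,\Gamma(r - 1)$ for $r > 1$ 
By induction $\Gamma(r) = (r - 1)(r - 2) \cdots (r - k)\,\Gamma(r - k)$ for $r > k$ 
For a positive integer n, $\Gamma(n) = (n - 1)!$ with $\Gamma(1) = 0! = 1$ 

2. By a change of variable in the gamma integral, we obtain 

$\int_0^{\infty} t^r e^{-\lambda t}\,dt = \frac{\Gamma(r + 1)}{\lambda^{r+1}}$, $r > -1$, $\lambda > 0$ (17.7) 

3. A well known indefinite integral gives 

$\int_a^{\infty} t e^{-\lambda t}\,dt = \frac{1}{\lambda^2} e^{-\lambda a} (1 + \lambda a)$ and $\int_a^{\infty} t^2 e^{-\lambda t}\,dt = \frac{2}{\lambda^3} e^{-\lambda a} \left[1 + \lambda a + (\lambda a)^2/2\right]$ (17.8) 

For any positive integer m, 

$\int_a^{\infty} t^m e^{-\lambda t}\,dt = \frac{m!}{\lambda^{m+1}} e^{-\lambda a} \left[1 + \lambda a + \frac{(\lambda a)^2}{2!} + \cdots + \frac{(\lambda a)^m}{m!}\right]$ (17.9) 

4. The following integrals are important for the Beta distribution. 

$\int_0^1 u^r (1 - u)^s\,du = \frac{\Gamma(r + 1)\,\Gamma(s + 1)}{\Gamma(r + s + 2)}$, $r > -1$, $s > -1$ (17.10) 

For nonnegative integers m, n, $\int_0^1 u^m (1 - u)^n\,du = \frac{m!\,n!}{(m + n + 1)!}$ 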



17.2.3 Some basic counting problems 

We consider three basic counting problems, which are used repeatedly as components of more complex 
problems. The first two, arrangements and occupancy, are equivalent. The third is a basic matching 
problem. 

I. Arrangements of r objects selected from among n distinguishable objects. 

a. The order is significant. 

b. The order is irrelevant. 
For each of these, we consider two additional alternative conditions. 

a. No element may be selected more than once. 

b. Repetition is allowed. 

II. Occupancy of n distinct cells by r objects. These objects are 

a. Distinguishable. 

b. Indistinguishable. 

The occupancy may be 

a. Exclusive. 

b. Nonexclusive (i.e., more than one object per cell) 

The results in the four cases may be summarized as follows: 

a. 1. Ordered arrangements, without repetition (permutations). Distinguishable objects, 
exclusive occupancy. 

$P(n, r) = \frac{n!}{(n - r)!}$ (17.11) 

2. Ordered arrangements, with repetition allowed. Distinguishable objects, nonexclusive 
occupancy. 

$U(n, r) = n^r$ (17.12) 

b. 1. Arrangements without repetition, order irrelevant (combinations). Indistinguishable 
objects, exclusive occupancy. 

$C(n, r) = \frac{n!}{r!\,(n - r)!} = \frac{P(n, r)}{r!}$ (17.13) 

2. Unordered arrangements, with repetition. Indistinguishable objects, nonexclusive occupancy. 

$S(n, r) = C(n + r - 1, r)$ (17.14) 
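The four counts can be evaluated side by side for small cases. The Python sketch below does so with math.perm and math.comb from the standard library; the function name arrangement_counts and the dictionary labels are hypothetical.

```python
from math import comb, perm

def arrangement_counts(n, r):
    """The four basic counts for r objects chosen from n distinguishable ones."""
    return {
        "ordered, no repetition": perm(n, r),            # P(n, r)
        "ordered, repetition": n**r,                     # U(n, r)
        "unordered, no repetition": comb(n, r),          # C(n, r)
        "unordered, repetition": comb(n + r - 1, r),     # S(n, r)
    }
```

For n = 5 and r = 2 the counts are 20, 25, 10 and 15 respectively.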

III. Matching n distinguishable elements to a fixed order. Let M(n, k) be the number of permutations 
which give k matches. 

Example 17.1: n = 5 
Natural order 1 2 3 4 5 

Permutation 3 2 5 4 1 (Two matches: positions 2, 4) 

We reduce the problem to determining M(n, 0), as follows: 

1. Select k places for matches in C(n, k) ways. 

2. Order the n - k remaining elements so that there are no matches in the other n - k places. 

$M(n, k) = C(n, k)\,M(n - k, 0)$ (17.15) 

Some algebraic trickery shows that M(n, 0) is the integer nearest n!/e. These are easily calculated by 
the MATLAB command M = round(gamma(n+1)/exp(1)). For example 

> M = round(gamma([3:10]+1)/exp(1)); 
> disp([3:6;M(1:4);7:10;M(5:8)]') 

     3       2       7    1854 
     4       9       8   14833 
     5      44       9  133496 
     6     265      10 1334961 
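The reduction to M(n, 0) can be checked by summing M(n, k) over k, which must give n!. The following Python sketch uses the nearest-integer rule from the text; the function names are hypothetical, the case n = 0 is set to 1 by convention (the empty arrangement has no matches), and the floating-point rounding is an assumption that is only safe for moderate n.

```python
from math import comb, e, factorial

def no_match_count(n):
    """M(n, 0): permutations of n elements with no matches."""
    if n == 0:
        return 1                       # convention for the empty arrangement
    return round(factorial(n) / e)     # integer nearest n!/e

def match_count(n, k):
    """M(n, k) = C(n, k) * M(n - k, 0): permutations with exactly k matches."""
    return comb(n, k) * no_match_count(n - k)
```

For n = 5 the counts for k = 0, ..., 5 are 44, 45, 20, 10, 0, 1, which sum to 5! = 120.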
17.2.4 Extended binomial coefficients and the binomial series 

The ordinary binomial coefficient is $C(n, k) = \frac{n!}{k!\,(n - k)!}$ for integers $n > 0$, $0 \le k \le n$ 
For any real x, any integer k, we extend the definition by 

$C(x, 0) = 1$, $C(x, k) = 0$ for $k < 0$, and $C(n, k) = 0$ for a positive integer $k > n$ (17.16) 

and 

$C(x, k) = \frac{x(x - 1)(x - 2) \cdots (x - k + 1)}{k!}$ otherwise (17.17) 

Then Pascal's relation holds: $C(x, k) = C(x - 1, k - 1) + C(x - 1, k)$ 
The power series expansion about t = 0 shows 

$(1 + t)^x = 1 + C(x, 1)\,t + C(x, 2)\,t^2 + \cdots$ for all x, $-1 < t < 1$ (17.18) 

For x = n, a positive integer, the series becomes a polynomial of degree n. 
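The extended coefficient is easy to compute as a running product, and the binomial series can then be checked numerically. The Python sketch below is illustrative only; the names ext_binom and binomial_series are hypothetical, and the number of terms retained in the series is an arbitrary choice.

```python
def ext_binom(x, k):
    """Extended binomial coefficient C(x, k) for real x and integer k."""
    if k < 0:
        return 0.0
    c = 1.0
    for j in range(k):
        c *= (x - j) / (j + 1)         # builds x(x-1)...(x-k+1)/k! term by term
    return c

def binomial_series(x, t, terms=60):
    """Partial sum of the series for (1 + t)**x, intended for |t| < 1."""
    return sum(ext_binom(x, k) * t**k for k in range(terms))
```

For a positive integer x = n the factor (n - n) makes every coefficient with k > n vanish, so the partial sum is the usual binomial polynomial.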

17.2.5 Cauchy's equation 

1. Let f be a real-valued function defined on $(0, \infty)$, such that 

a. $f(t + u) = f(t) + f(u)$ for $t, u > 0$, and 

b. There is an open interval I on which f is bounded above (or is bounded below). 

Then $f(t) = f(1)\,t$ for all $t > 0$ 

2. Let f be a real-valued function defined on $(0, \infty)$ such that 

a. $f(t + u) = f(t)\,f(u)$ for all $t, u > 0$, and 

b. There is an interval on which f is bounded above. 

Then, either $f(t) = 0$ for $t > 0$, or there is a constant a such that $f(t) = e^{at}$ for $t > 0$ 
[For a proof, see Billingsley, Probability and Measure, second edition, appendix A20] 

17.2.6 Countable and uncountable sets 

A set (or class) is countable iff either it is finite or its members can be put into a one-to-one correspondence 
with the natural numbers. 

Examples 

• The set of odd integers is countable. 

• The finite set $\{n : 1 \le n \le 1000\}$ is countable. 

• The set of all rational numbers is countable. (This is established by an argument known as 
diagonalization.) 

• The set of pairs of elements from two countable sets is countable. 

• The union of a countable class of countable sets is countable. 

A set is uncountable iff it is neither finite nor can be put into a one-to-one correspondence with the natural 
numbers. 

Examples 

• The class of positive real numbers is uncountable. A well known operation shows that the assumption 
of countability leads to a contradiction. 

• The set of real numbers in any finite interval is uncountable, since these can be put into a one-to-one 
correspondence with the class of all positive reals. 
17.3 Appendix C: Data on some common distributions 3 
17.3.1 Discrete distributions 

1. Indicator function $X = I_E$, $P(X = 1) = P(E) = p$, $P(X = 0) = q = 1 - p$ 

$E[X] = p$, $\mathrm{Var}[X] = pq$, $M_X(s) = q + p e^s$, $g_X(s) = q + ps$ (17.19) 

2. Simple random variable $X = \sum_{i=1}^{n} t_i I_{A_i}$ (a primitive form), $P(A_i) = p_i$ 

$E[X] = \sum_{i=1}^{n} t_i p_i$, $\mathrm{Var}[X] = \sum_{i=1}^{n} t_i^2 p_i q_i - 2 \sum_{i<j} t_i t_j p_i p_j$, $M_X(s) = \sum_{i=1}^{n} p_i e^{s t_i}$ (17.20) 

3. Binomial(n, p). $X = \sum_{i=1}^{n} I_{E_i}$ with $\{I_{E_i} : 1 \le i \le n\}$ iid, $P(E_i) = p$ 

$P(X = k) = C(n, k)\,p^k q^{n-k}$ (17.21) 

$E[X] = np$, $\mathrm{Var}[X] = npq$, $M_X(s) = (q + p e^s)^n$, $g_X(s) = (q + ps)^n$ (17.22) 

MATLAB: P(X = k) = ibinom(n, p, k), P(X ≥ k) = cbinom(n, p, k) 

4. Geometric(p). $P(X = k) = p q^k$ for all $k \ge 0$ 

$E[X] = q/p$, $\mathrm{Var}[X] = q/p^2$, $M_X(s) = \frac{p}{1 - q e^s}$, $g_X(s) = \frac{p}{1 - qs}$ (17.23) 

If $Y - 1 \sim$ geometric(p), so that $P(Y = k) = p q^{k-1}$ for all $k \ge 1$, then 

$E[Y] = 1/p$, $\mathrm{Var}[Y] = q/p^2$, $M_Y(s) = \frac{p e^s}{1 - q e^s}$, $g_Y(s) = \frac{ps}{1 - qs}$ (17.24) 

5. Negative binomial(m, p). X is the number of failures before the mth success. 
$P(X = k) = C(m + k - 1, m - 1)\,p^m q^k$ for all $k \ge 0$. 

$E[X] = mq/p$, $\mathrm{Var}[X] = mq/p^2$, $M_X(s) = \left(\frac{p}{1 - q e^s}\right)^m$, $g_X(s) = \left(\frac{p}{1 - qs}\right)^m$ (17.25) 

For $Y_m = X_m + m$, the number of the trial on which the mth success occurs, 
$P(Y = k) = C(k - 1, m - 1)\,p^m q^{k-m}$ for all $k \ge m$. 

$E[Y] = m/p$, $\mathrm{Var}[Y] = mq/p^2$, $M_Y(s) = \left(\frac{p e^s}{1 - q e^s}\right)^m$, $g_Y(s) = \left(\frac{ps}{1 - qs}\right)^m$ (17.26) 

MATLAB: P(Y = k) = nbinom(m, p, k) 

6. Poisson($\mu$). $P(X = k) = e^{-\mu} \frac{\mu^k}{k!}$ for all $k \ge 0$ 

$E[X] = \mu$, $\mathrm{Var}[X] = \mu$, $M_X(s) = e^{\mu(e^s - 1)}$, $g_X(s) = e^{\mu(s - 1)}$ (17.27) 

MATLAB: P(X = k) = ipoisson(m, k), P(X ≥ k) = cpoisson(m, k) 



3 This content is available online at <http://cnx.org/content/m23992/1.5/>. 
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17.3.2 Absolutely continuous distributions 

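1. Uniform$(a, b)$. $f_X(t) = \dfrac{1}{b - a}$, $a < t < b$ (zero elsewhere)

$$E[X] = \frac{b + a}{2} \qquad \mathrm{Var}[X] = \frac{(b - a)^2}{12} \qquad M_X(s) = \frac{e^{sb} - e^{sa}}{s(b - a)} \tag{17.28}$$

2. Symmetric triangular$(-a, a)$.

$$f_X(t) = \begin{cases} (a + t)/a^2 & -a \le t < 0 \\ (a - t)/a^2 & 0 \le t \le a \end{cases}$$

$$E[X] = 0 \qquad \mathrm{Var}[X] = \frac{a^2}{6} \qquad M_X(s) = \frac{e^{as} + e^{-as} - 2}{a^2 s^2} = \frac{e^{as} - 1}{as} \cdot \frac{1 - e^{-as}}{as} \tag{17.29}$$

3. Exponential$(\lambda)$. $f_X(t) = \lambda e^{-\lambda t}$, $t \ge 0$

$$E[X] = \frac{1}{\lambda} \qquad \mathrm{Var}[X] = \frac{1}{\lambda^2} \qquad M_X(s) = \frac{\lambda}{\lambda - s} \tag{17.30}$$

4. Gamma$(\alpha, \lambda)$. $f_X(t) = \dfrac{\lambda^{\alpha} t^{\alpha - 1} e^{-\lambda t}}{\Gamma(\alpha)}$, $t \ge 0$

$$E[X] = \frac{\alpha}{\lambda} \qquad \mathrm{Var}[X] = \frac{\alpha}{\lambda^2} \qquad M_X(s) = \left(\frac{\lambda}{\lambda - s}\right)^{\alpha} \tag{17.31}$$

MATLAB: $P(X \le t)$ = gammadbn($\alpha$, $\lambda$, t)

5. Normal $N(\mu, \sigma^2)$. $f_X(t) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left(-\dfrac{1}{2}\left(\dfrac{t - \mu}{\sigma}\right)^2\right)$

$$E[X] = \mu \qquad \mathrm{Var}[X] = \sigma^2 \qquad M_X(s) = \exp\left(\frac{\sigma^2 s^2}{2} + \mu s\right) \tag{17.32}$$

MATLAB: $P(X \le t)$ = gaussian($\mu$, $\sigma^2$, t)

6. Beta$(r, s)$.

$$f_X(t) = \frac{\Gamma(r + s)}{\Gamma(r)\Gamma(s)}\, t^{r-1} (1 - t)^{s-1}, \qquad 0 < t < 1,\ r > 0,\ s > 0 \tag{17.33}$$

$$E[X] = \frac{r}{r + s} \qquad \mathrm{Var}[X] = \frac{rs}{(r + s)^2 (r + s + 1)} \tag{17.34}$$

MATLAB: $f_X(t)$ = beta(r, s, t)   $P(X \le t)$ = betadbn(r, s, t)

7. Weibull$(\alpha, \lambda, \nu)$.

$$F_X(t) = 1 - e^{-\lambda (t - \nu)^{\alpha}}, \qquad \alpha > 0,\ \lambda > 0,\ \nu \ge 0,\ t \ge \nu \tag{17.35}$$

$$E[X] = \frac{1}{\lambda^{1/\alpha}}\, \Gamma(1 + 1/\alpha) + \nu \qquad \mathrm{Var}[X] = \frac{1}{\lambda^{2/\alpha}} \left[\Gamma(1 + 2/\alpha) - \Gamma^2(1 + 1/\alpha)\right] \tag{17.36}$$

MATLAB: ($\nu = 0$ only)

$$f_X(t) = \text{weibull}(\alpha, \lambda, t) \qquad P(X \le t) = \text{weibulld}(\alpha, \lambda, t) \tag{17.37}$$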
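
As a sanity check on the Weibull moments, the following sketch compares the closed forms, with gamma-function arguments taken as 1 + 1/alpha and 1 + 2/alpha, against numerical integration of the survival function for the case nu = 0. The parameter values and the integration grid are illustrative assumptions, and only base MATLAB functions are used.

```matlab
% Sketch: numerical check of the Weibull mean and variance for nu = 0.
% alpha and lambda are illustrative values, not data from the text.
alpha = 2;  lambda = 0.5;
t = 0:1e-3:20;                         % grid; the tail beyond 20 is negligible here
S = exp(-lambda * t.^alpha);           % survival function 1 - F_X(t)
EX_num  = trapz(t, S);                 % E[X]   = integral of (1 - F_X) for X >= 0
EX2_num = trapz(t, 2 * t .* S);        % E[X^2] = integral of 2t (1 - F_X)
VX_num  = EX2_num - EX_num^2;
EX = gamma(1 + 1/alpha) / lambda^(1/alpha);                            % closed form
VX = (gamma(1 + 2/alpha) - gamma(1 + 1/alpha)^2) / lambda^(2/alpha);   % closed form
fprintf('E[X]: %.6f vs %.6f   Var[X]: %.6f vs %.6f\n', EX_num, EX, VX_num, VX);
```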

17.3.3 Relationship between gamma and Poisson distributions 

• If $X \sim$ gamma$(n, \lambda)$, then $P(X \le t) = P(Y \ge n)$ where $Y \sim$ Poisson$(\lambda t)$. 

• If $Y \sim$ Poisson$(\lambda t)$, then $P(Y \ge n) = P(X \le t)$ where $X \sim$ gamma$(n, \lambda)$. 
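
The gamma and Poisson relationship is easy to confirm numerically for integer n. This is a minimal sketch using the base function gammainc (the regularized lower incomplete gamma function) rather than the text's gammadbn and cpoisson m-functions; the values of n, lambda, and t are illustrative assumptions.

```matlab
% Sketch: check P(X <= t) = P(Y >= n) for X ~ gamma(n, lambda), Y ~ Poisson(lambda*t).
% n, lambda and t are illustrative values.
n = 4;  lambda = 2;  t = 1.5;
mu = lambda * t;
PX = gammainc(lambda * t, n);                      % P(X <= t), regularized lower incomplete gamma
k  = 0:n-1;
PY = 1 - sum(exp(-mu) * mu.^k ./ factorial(k));    % P(Y >= n) = 1 - P(Y <= n-1)
fprintf('P(X <= t) = %.10f   P(Y >= n) = %.10f\n', PX, PY);
```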
17.4 Appendix D to Applied Probability: The standard normal distribution 4 



$$\Phi(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-u^2/2}\, du \qquad \Phi(-t) = 1 - \Phi(t) \tag{17.38}$$

  t     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
 0.0   0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
 0.1   0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
 0.2   0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
 0.3   0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
 0.4   0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
 0.5   0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
 0.6   0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
 0.7   0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
 0.8   0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
 0.9   0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
 1.0   0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
 1.1   0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
 1.2   0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
 1.3   0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
 1.4   0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
 1.5   0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
 1.6   0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
 1.7   0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
 1.8   0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
 1.9   0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
 2.0   0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817
 2.1   0.9821  0.9826  0.9830  0.9834  0.9838  0.9842  0.9846  0.9850  0.9854  0.9857
 2.2   0.9861  0.9864  0.9868  0.9871  0.9875  0.9878  0.9881  0.9884  0.9887  0.9890
 2.3   0.9893  0.9896  0.9898  0.9901  0.9904  0.9906  0.9909  0.9911  0.9913  0.9916
 2.4   0.9918  0.9920  0.9922  0.9925  0.9927  0.9929  0.9931  0.9932  0.9934  0.9936
 2.5   0.9938  0.9940  0.9941  0.9943  0.9945  0.9946  0.9948  0.9949  0.9951  0.9952
 2.6   0.9953  0.9955  0.9956  0.9957  0.9959  0.9960  0.9961  0.9962  0.9963  0.9964
 2.7   0.9965  0.9966  0.9967  0.9968  0.9969  0.9970  0.9971  0.9972  0.9973  0.9974
 2.8   0.9974  0.9975  0.9976  0.9977  0.9977  0.9978  0.9979  0.9979  0.9980  0.9981
 2.9   0.9981  0.9982  0.9982  0.9983  0.9984  0.9984  0.9985  0.9985  0.9986  0.9986
 3.0   0.9987  0.9987  0.9987  0.9988  0.9988  0.9989  0.9989  0.9989  0.9990  0.9990

Table 17.1 

4 This content is available online at <http://cnx.org/content/m23995/1.5/>. 
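
Entries of the standard normal table can be regenerated from (17.38) with the base function erfc, using the identity Phi(t) = 0.5 erfc(-t / sqrt(2)). The sketch below is an illustration under that identity and is not the text's gaussian m-function; the names Phi, rows, cols, and T are assumptions introduced here.

```matlab
% Sketch: regenerate the standard normal table from Phi(t) = 0.5*erfc(-t/sqrt(2)).
Phi  = @(t) 0.5 * erfc(-t / sqrt(2));   % standard normal distribution function
rows = (0:0.1:3.0)';                    % first decimal of t (row labels)
cols = 0:0.01:0.09;                     % second decimal of t (column labels)
T = Phi(bsxfun(@plus, rows, cols));     % T(i,j) = Phi(rows(i) + cols(j))
for i = 1:numel(rows)
  fprintf('%4.1f ', rows(i));
  fprintf(' %6.4f', T(i, :));
  fprintf('\n');
end
```

For negative arguments the symmetry relation in (17.38) applies, so Phi(-1) is obtained as 1 - Phi(1).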



17.5 Appendix E to Applied Probability: Properties of mathematical 
expectation 5 



$$E[g(X)] = \int g(X)\, dP \tag{17.39}$$

We suppose, without repeated assertion, that the random variables and Borel functions of random variables 
or random vectors are integrable. Use of an expression such as $I_M(X)$ involves the tacit assumption that $M$ 
is a Borel set on the codomain of $X$. 

(E1): $E[a I_A] = aP(A)$, any constant $a$, any event $A$ 

(E1a): $E[I_M(X)] = P(X \in M)$ and $E[I_M(X) I_N(Y)] = P(X \in M, Y \in N)$ for any Borel sets $M$, $N$ 
(Extends to any finite product of such indicator functions of random vectors) 

(E2): Linearity. For any constants $a$, $b$, $E[aX + bY] = aE[X] + bE[Y]$ (Extends to any finite linear 
combination) 

(E3): Positivity; monotonicity. 

a. $X \ge 0$ a.s. implies $E[X] \ge 0$, with equality iff $X = 0$ a.s. 

b. $X \ge Y$ a.s. implies $E[X] \ge E[Y]$, with equality iff $X = Y$ a.s. 

(E4): Fundamental lemma. If $X \ge 0$ is bounded, and $\{X_n : 1 \le n\}$ is a.s. nonnegative, nondecreasing, 
with $\lim_n X_n(\omega) \ge X(\omega)$ for a.e. $\omega$, then $\lim_n E[X_n] \ge E[X]$ 

(E4a): Monotone convergence. If for all $n$, $0 \le X_n \le X_{n+1}$ a.s. and $X_n \to X$ a.s., 
then $E[X_n] \to E[X]$ (The theorem also holds if $E[X] = \infty$) 

(E5): Uniqueness. $*$ is to be read as one of the symbols $\le$, $=$, or $\ge$ 

a. $E[I_M(X) g(X)] * E[I_M(X) h(X)]$ for all $M$ iff $g(X) * h(X)$ a.s. 

b. $E[I_M(X) I_N(Z) g(X, Z)] = E[I_M(X) I_N(Z) h(X, Z)]$ for all $M$, $N$ iff $g(X, Z) = h(X, Z)$ a.s. 

(E6): Fatou's lemma. If $X_n \ge 0$ a.s., for all $n$, then $E[\liminf X_n] \le \liminf E[X_n]$ 

(E7): Dominated convergence. If real or complex $X_n \to X$ a.s., $|X_n| \le Y$ a.s. for all $n$, and $Y$ is 
integrable, then $\lim_n E[X_n] = E[X]$ 

(E8): Countable additivity and countable sums. 

a. If $X$ is integrable over $E$, and $E = \bigvee_{i=1}^{\infty} E_i$ (disjoint union), then $E[I_E X] = \sum_{i=1}^{\infty} E[I_{E_i} X]$ 

b. If $\sum_{n=1}^{\infty} E[|X_n|] < \infty$, then $\sum_{n=1}^{\infty} |X_n| < \infty$ a.s. and $E\left[\sum_{n=1}^{\infty} X_n\right] = \sum_{n=1}^{\infty} E[X_n]$ 

(E9): Some integrability conditions 

a. $X$ is integrable iff both $X^+$ and $X^-$ are integrable iff $|X|$ is integrable. 

b. $X$ is integrable iff $E[I_{\{|X| > a\}} |X|] \to 0$ as $a \to \infty$ 

c. If $X$ is integrable, then $X$ is a.s. finite 

d. If $E[X]$ exists and $P(A) = 0$, then $E[I_A X] = 0$ 

(E10): Triangle inequality. For integrable $X$, real or complex, $|E[X]| \le E[|X|]$ 

(E11): Mean-value theorem. If $a \le X \le b$ a.s. on $A$, then $aP(A) \le E[I_A X] \le bP(A)$ 

(E12): For nonnegative, Borel $g$, $E[g(X)] \ge aP(g(X) \ge a)$ 

(E13): Markov's inequality. If $g \ge 0$ and nondecreasing for $t \ge 0$ and $a \ge 0$, then 

$$g(a) P(|X| \ge a) \le E[g(|X|)] \tag{17.40}$$

(E14): Jensen's inequality. If $g$ is convex on an interval which contains the range of random variable $X$, 
then $g(E[X]) \le E[g(X)]$ 

(E15): Schwarz' inequality. For $X$, $Y$ real or complex, $|E[XY]|^2 \le E[|X|^2] E[|Y|^2]$, with equality iff 
there is a constant $c$ such that $X = cY$ a.s. 

(E16): Hölder's inequality. For $1 < p, q$, with $\frac{1}{p} + \frac{1}{q} = 1$, and $X$, $Y$ real or complex, 

$$E[|XY|] \le E[|X|^p]^{1/p} E[|Y|^q]^{1/q} \tag{17.41}$$

(E17): Minkowski's inequality. For $1 \le p$ and $X$, $Y$ real or complex, 

$$E[|X + Y|^p]^{1/p} \le E[|X|^p]^{1/p} + E[|Y|^p]^{1/p} \tag{17.42}$$

(E18): Independence and expectation. The following conditions are equivalent. 

a. The pair $\{X, Y\}$ is independent 

b. $E[I_M(X) I_N(Y)] = E[I_M(X)] E[I_N(Y)]$ for all Borel $M$, $N$ 

c. $E[g(X) h(Y)] = E[g(X)] E[h(Y)]$ for all Borel $g$, $h$ such that $g(X)$, $h(Y)$ are integrable. 

(E19): Special case of the Radon-Nikodym theorem. If $g(Y)$ is integrable and $X$ is a random vector, 
then there exists a real-valued Borel function $e(\cdot)$, defined on the range of $X$, unique a.s. $[P_X]$, such 
that $E[I_M(X) g(Y)] = E[I_M(X) e(X)]$ for all Borel sets $M$ on the codomain of $X$. 

(E20): Some special forms of expectation 

a. Suppose $F$ is nondecreasing, right-continuous on $[0, \infty)$, with $F(0^-) = 0$. Let $F^*(t) = F(t - 0)$. 
Consider $X \ge 0$ with $E[F(X)] < \infty$. Then, 

$$(1)\ E[F(X)] = \int_0^{\infty} P(X \ge t)\, F(dt) \quad \text{and} \quad (2)\ E[F^*(X)] = \int_0^{\infty} P(X > t)\, F(dt) \tag{17.43}$$

b. If $X$ is integrable, then $E[X] = \int_{-\infty}^{\infty} [u(t) - F_X(t)]\, dt$ 

c. If $X$, $Y$ are integrable, then $E[X - Y] = \int_{-\infty}^{\infty} [F_Y(t) - F_X(t)]\, dt$ 

d. If $X \ge 0$ is integrable, then 

$$\sum_{n=0}^{\infty} P(X \ge n + 1) \le E[X] \le \sum_{n=0}^{\infty} P(X \ge n) \le N \sum_{k=0}^{\infty} P(X \ge kN), \quad \text{for all } N \ge 1 \tag{17.44}$$

e. If integrable $X \ge 0$ is integer-valued, then $E[X] = \sum_{n=1}^{\infty} P(X \ge n) = \sum_{n=0}^{\infty} P(X > n)$ and 
$E[X^2] = \sum_{n=1}^{\infty} (2n - 1) P(X \ge n) = \sum_{n=0}^{\infty} (2n + 1) P(X > n)$ 

f. If $Q$ is the quantile function for $F_X$, then $E[g(X)] = \int_0^1 g[Q(u)]\, du$ 

5 This content is available online at <http://cnx.org/content/m23998/1.6/>. 
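
Several of the properties above can be checked by direct computation on a small integer-valued distribution. The sketch below tests the tail-sum formulas of (E20e), Markov's inequality (E13) with g(t) = t^2, and Jensen's inequality (E14) with the same g. The distribution px is an illustrative assumption, not data from the text.

```matlab
% Sketch: check (E20e), (E13) and (E14) on a small integer-valued distribution.
% The values x and probabilities px are illustrative.
x  = 0:5;
px = [0.10 0.25 0.30 0.20 0.10 0.05];
EX  = sum(x .* px);                        % E[X]
EX2 = sum(x.^2 .* px);                     % E[X^2]
tailge = fliplr(cumsum(fliplr(px)));       % tailge(j) = P(X >= x(j))
n  = 1:5;
S1 = sum(tailge(n + 1));                   % sum over n >= 1 of P(X >= n)
S2 = sum((2*n - 1) .* tailge(n + 1));      % sum over n >= 1 of (2n - 1) P(X >= n)
a  = 3;                                    % Markov with g(t) = t^2 and a = 3
markov_lhs = a^2 * tailge(a + 1);          % g(a) P(|X| >= a)
fprintf('E[X] = %.4f, tail sum = %.4f\n', EX, S1);
fprintf('E[X^2] = %.4f, weighted tail sum = %.4f\n', EX2, S2);
fprintf('Markov: %.4f <= %.4f   Jensen: %.4f <= %.4f\n', markov_lhs, EX2, EX^2, EX2);
```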



17.6 Appendix F to Applied Probability: Properties of conditional 
expectation, given a random vector 6 

We suppose, without repeated assertion, that the random variables and functions of random vectors are 
integrable, as needed. 

(CE1): Defining condition. $e(X) = E[g(Y) | X]$ a.s. iff $E[I_M(X) g(Y)] = E[I_M(X) e(X)]$ for each 
Borel set $M$ on the codomain of $X$. 

(CE1a): If $P(X \in M) > 0$, then $E[I_M(X) e(X)] = E[g(Y) | X \in M]\, P(X \in M)$ 

(CE1b): Law of total probability. $E[g(Y)] = E\{E[g(Y) | X]\}$ 

(CE2): Linearity. For any constants $a$, $b$ 

$$E[a g(Y) + b h(Z) | X] = aE[g(Y) | X] + bE[h(Z) | X] \quad \text{a.s.}$$

(Extends to any finite linear combination) 

(CE3): Positivity; monotonicity. 

a. $g(Y) \ge 0$ a.s. implies $E[g(Y) | X] \ge 0$ a.s. 

b. $g(Y) \ge h(Z)$ a.s. implies $E[g(Y) | X] \ge E[h(Z) | X]$ a.s. 

(CE4): Monotone convergence. $Y_n \to Y$ a.s. monotonically implies $E[Y_n | X] \to E[Y | X]$ a.s. 

(CE5): Independence. $\{X, Y\}$ is an independent pair 

• iff $E[g(Y) | X] = E[g(Y)]$ a.s. for all Borel functions $g$ 

• iff $E[I_N(Y) | X] = E[I_N(Y)]$ a.s. for all Borel sets $N$ on the codomain of $Y$ 

(CE6): $e(X) = E[g(Y) | X]$ a.s. iff $E[h(X) g(Y)] = E[h(X) e(X)]$ a.s. for any Borel function $h$ 

(CE7): $E[h(X) | X] = h(X)$ a.s. for any Borel function $h$ 

(CE8): $E[h(X) g(Y) | X] = h(X) E[g(Y) | X]$ a.s. for any Borel function $h$ 

(CE9): If $X = h(W)$, then $E\{E[g(Y) | X] | W\} = E\{E[g(Y) | W] | X\} = E[g(Y) | X]$ a.s. 

(CE9a): $E\{E[g(Y) | X] | X, Z\} = E\{E[g(Y) | X, Z] | X\} = E[g(Y) | X]$ a.s. 

(CE9b): If $X = h(W)$ and $W = k(X)$, with $h$, $k$ Borel functions, then $E[g(Y) | X] = E[g(Y) | W]$ a.s. 

(CE10): If $g$ is a Borel function such that $E[g(t, Y)]$ is finite for all $t$ on the range of $X$ and $E[g(X, Y)]$ 
is finite, then 

a. $E[g(X, Y) | X = t] = E[g(t, Y) | X = t]$ a.s. $[P_X]$ 

b. If $\{X, Y\}$ is independent, then $E[g(X, Y) | X = t] = E[g(t, Y)]$ a.s. $[P_X]$ 

(CE11): Suppose $\{X(t) : t \in T\}$ is a real-valued, measurable random process whose parameter set $T$ is a 
Borel subset of the real line and $S$ is a random variable whose range is a subset of $T$, so that $X(S)$ is 
a random variable. 
If $E[X(t)]$ is finite for all $t$ in $T$ and $E[X(S)]$ is finite, then 

a. $E[X(S) | S = t] = E[X(t) | S = t]$ a.s. $[P_S]$ 

b. If, in addition, $\{S, X_T\}$ is independent, then $E[X(S) | S = t] = E[X(t)]$ a.s. $[P_S]$ 

(CE12): Countable additivity and countable sums. 

a. If $Y$ is integrable on $A$ and $A = \bigvee_{n=1}^{\infty} A_n$, 
then $E[I_A Y | X] = \sum_{n=1}^{\infty} E[I_{A_n} Y | X]$ a.s. 

b. If $\sum_{n=1}^{\infty} E[|Y_n|] < \infty$, then $E\left[\sum_{n=1}^{\infty} Y_n \,\Big|\, X\right] = \sum_{n=1}^{\infty} E[Y_n | X]$ a.s. 

(CE13): Triangle inequality. $|E[g(Y) | X]| \le E[|g(Y)| \,|\, X]$ a.s. 

(CE14): Jensen's inequality. If $g$ is a convex function on an interval $I$ which contains the range of a real 
random variable $Y$, then $g\{E[Y | X]\} \le E[g(Y) | X]$ a.s. 

(CE15): Suppose $E[|Y|^p] < \infty$ and $E[|Z|^p] < \infty$ for $1 \le p < \infty$. Then 

$$E\{|E[Y | X] - E[Z | X]|^p\} \le E[|Y - Z|^p] < \infty$$

6 This content is available online at <http://cnx.org/content/m24001/1.6/>. 
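
For a discrete pair the regression function e(X) = E[g(Y)|X] is a weighted average along each row of the joint probability matrix, and properties (CE1) and (CE1b) reduce to finite sums. The sketch below uses a small joint distribution chosen for illustration, with rows indexed by the values of X and columns by the values of Y. This layout is an assumption of the sketch and may differ from the orientation used by the text's m-procedures.

```matlab
% Sketch: conditional expectation for a discrete pair; check (CE1) and (CE1b).
% Joint probabilities are illustrative: P(i,j) = P(X = X(i), Y = Y(j)).
X = [1 2 3];
Y = [0 1 2 4];
P = [0.05 0.10 0.05 0.10
     0.10 0.15 0.05 0.05
     0.05 0.10 0.10 0.10];
g  = @(y) y.^2;
PX = sum(P, 2)';                        % marginal distribution of X (row vector)
eX = (P * g(Y)')' ./ PX;                % eX(i) = E[g(Y) | X = X(i)]
EgY = sum(sum(P, 1) .* g(Y));           % E[g(Y)] from the marginal of Y
EeX = sum(eX .* PX);                    % E{E[g(Y)|X]}, law of total probability
M   = (X >= 2);                         % indicator of the event {X in M}
lhs = sum(sum(P(M, :) .* repmat(g(Y), sum(M), 1)));   % E[I_M(X) g(Y)]
rhs = sum(eX(M) .* PX(M));                            % E[I_M(X) e(X)]
fprintf('E[g(Y)] = %.4f, E{E[g(Y)|X]} = %.4f\n', EgY, EeX);
fprintf('E[I_M(X) g(Y)] = %.4f, E[I_M(X) e(X)] = %.4f\n', lhs, rhs);
```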



17.7 Appendix G to Applied Probability: Properties of conditional 
independence, given a random vector 7 

Definition. The pair $\{X, Y\}$ is conditionally independent, given $Z$, denoted $\{X, Y\}$ ci $|Z$, iff 

$$E[I_M(X) I_N(Y) | Z] = E[I_M(X) | Z]\, E[I_N(Y) | Z] \quad \text{a.s. for all Borel sets } M, N \tag{17.45}$$

An arbitrary class $\{X_t : t \in T\}$ of random vectors is conditionally independent, given $Z$, iff such a product 
rule holds for each finite subclass of two or more members of the class. 

Remark. The expression "for all Borel sets M, N," here and elsewhere, implies the sets are on the 
appropriate codomains. Also, the expressions below "for all Borel functions g," etc., imply that the functions 
are real-valued, such that the indicated expectations are finite. 

The following are equivalent. Each is necessary and sufficient that $\{X, Y\}$ ci $|Z$. 

(CI1): $E[I_M(X) I_N(Y) | Z] = E[I_M(X) | Z]\, E[I_N(Y) | Z]$ a.s. for all Borel sets $M$, $N$ 
(CI2): $E[I_M(X) | Z, Y] = E[I_M(X) | Z]$ a.s. for all Borel sets $M$ 
(CI3): $E[I_M(X) I_Q(Z) | Z, Y] = E[I_M(X) I_Q(Z) | Z]$ a.s. for all Borel sets $M$, $Q$ 
(CI4): $E[I_M(X) I_Q(Z) | Y] = E\{E[I_M(X) I_Q(Z) | Z] | Y\}$ a.s. for all Borel sets $M$, $Q$ 

(CI5): $E[g(X, Z) h(Y, Z) | Z] = E[g(X, Z) | Z]\, E[h(Y, Z) | Z]$ a.s. for all Borel functions $g$, $h$ 
(CI6): $E[g(X, Z) | Z, Y] = E[g(X, Z) | Z]$ a.s. for all Borel functions $g$ 
(CI7): For any Borel function $g$, there exists a Borel function $e_g$ such that 

$$E[g(X, Z) | Z, Y] = e_g(Z) \quad \text{a.s.} \tag{17.46}$$

(CI8): $E[g(X, Z) | Y] = E\{E[g(X, Z) | Z] | Y\}$ a.s. for all Borel functions $g$ 

(CI9): $\{U, V\}$ ci $|Z$, where $U = g(X, Z)$ and $V = h(Y, Z)$, for any Borel functions $g$, $h$. 

Additional properties of conditional independence 

(CI10): $\{X, Y\}$ ci $|Z$ implies $\{X, Y\}$ ci $|(Z, U)$, $\{X, Y\}$ ci $|(Z, V)$, and $\{X, Y\}$ ci $|(Z, U, V)$, where 
$U = h(X)$ and $V = k(Y)$, with $h$, $k$ Borel. 
(CI11): $\{X, Z\}$ ci $|Y$ and $\{X, W\}$ ci $|(Y, Z)$ iff $\{X, (Z, W)\}$ ci $|Y$. 
(CI12): $\{X, Z\}$ ci $|Y$ and $\{(X, Y), W\}$ ci $|Z$ implies $\{X, (Z, W)\}$ ci $|Y$. 
(CI13): $\{X, Y\}$ is independent and $\{X, Z\}$ ci $|Y$ iff $\{X, (Y, Z)\}$ is independent. 
(CI14): $\{X, Y\}$ ci $|Z$ implies $E[g(X, Y) | Y = u, Z = v] = E[g(X, u) | Z = v]$ a.s. $[P_{YZ}]$ 
(CI15): $\{X, Y\}$ ci $|Z$ implies 

a. $E[g(X, Z) h(Y, Z)] = E\{E[g(X, Z) | Z]\, E[h(Y, Z) | Z]\} = E[e_1(Z) e_2(Z)]$ 

b. $E[g(Y) | X \in M]\, P(X \in M) = E\{E[I_M(X) | Z]\, E[g(Y) | Z]\}$ 

(CI16): $\{(X, Y), Z\}$ ci $|W$ iff $E[I_M(X) I_N(Y) I_Q(Z) | W] = E[I_M(X) I_N(Y) | W]\, E[I_Q(Z) | W]$ a.s. 
for all Borel sets $M$, $N$, $Q$ 

7 This content is available online at <http://cnx.org/content/m24003/1.6/>. 
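
The product rule (CI1) can be illustrated with a small discrete example in which X and Y are built to be conditionally independent given Z yet are not independent. All probabilities in this sketch are illustrative assumptions, and the three-dimensional array layout is a choice made here for convenience.

```matlab
% Sketch: conditional independence given Z without unconditional independence.
% All probabilities are illustrative.
pz  = [0.4 0.6];                  % pz(k)    = P(Z = z_k)
pxz = [0.2 0.8; 0.7 0.3];         % pxz(k,i) = P(X = x_i | Z = z_k)
pyz = [0.5 0.5; 0.1 0.9];         % pyz(k,j) = P(Y = y_j | Z = z_k)
P = zeros(2, 2, 2);               % P(i,j,k) = P(X = x_i, Y = y_j, Z = z_k)
for k = 1:2
  P(:, :, k) = pz(k) * (pxz(k, :)' * pyz(k, :));
end
lhs = zeros(1, 2);  rhs = zeros(1, 2);
for k = 1:2                       % product rule for M = {x_1}, N = {y_1}
  pk = sum(sum(P(:, :, k)));                              % P(Z = z_k)
  lhs(k) = P(1, 1, k) / pk;                               % P(X = x_1, Y = y_1 | Z = z_k)
  rhs(k) = (sum(P(1, :, k)) / pk) * (sum(P(:, 1, k)) / pk);
end
Pxy = sum(P, 3);                  % joint distribution of (X, Y)
dep = Pxy(1, 1) - sum(Pxy(1, :)) * sum(Pxy(:, 1));   % nonzero when X, Y are dependent
fprintf('max product-rule gap given Z: %.2e\n', max(abs(lhs - rhs)));
fprintf('P(x1,y1) - P(x1)P(y1) = %.4f\n', dep);
```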



17.8 Matlab files for "Problems" in "Applied Probability" 8 
17.8.1 npr02_04 

% file npr02_04.m
% Data for problem P2-4
pm = [0.0168 0.0392 0.0672 0.1568 0.0072 0.0168 0.0288 0.0672 ...
      0.0252 0.0588 0.1008 0.2352 0.0108 0.0252 0.0432 0.1008];
disp('Minterm probabilities are in pm.  Use mintable(4)')
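
To show how a vector of minterm probabilities such as pm is used, the following sketch builds the minterm vectors for four events with base functions and sums the probabilities of the selected minterms. It assumes the usual minterm ordering, in which the first event corresponds to the leading binary digit of the minterm number; the matrix M stands in for the output of the text's mintable(4), which is not reproduced here.

```matlab
% Sketch: use minterm probabilities with minterm vectors built from base functions.
% Assumes minterm numbers 0..15 with event A as the leading binary digit.
pm = [0.0168 0.0392 0.0672 0.1568 0.0072 0.0168 0.0288 0.0672 ...
      0.0252 0.0588 0.1008 0.2352 0.0108 0.0252 0.0432 0.1008];
n = 4;
M = (dec2bin(0:2^n - 1) - '0')';      % n x 2^n matrix of minterm vectors
A = M(1, :);  B = M(2, :);  C = M(3, :);  D = M(4, :);
PA    = sum(pm(A == 1));              % P(A)
PB    = sum(pm(B == 1));              % P(B)
PAB   = sum(pm(A & B));               % P(AB)
PAorD = sum(pm(A | D));               % P(A or D)
fprintf('P(A) = %.4f  P(B) = %.4f  P(AB) = %.4f  P(A or D) = %.4f\n', PA, PB, PAB, PAorD);
```

With these data P(AB) equals P(A)P(B), so the pair {A, B} satisfies the product rule.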



17.8.2 npr02_05 

% file npr02_05.m
% Data for problem P2-5
pm = [0.0216 0.0144 0.0504 0.0336 0.0324 0.0216 0.0756 0.0504 0.0216 ...
      0.0144 0.0504 0.0336 0.0324 0.0216 0.0756 0.0504 0.0144 0.0096 ...
      0.0336 0.0224 0.0216 0.0144 0.0504 0.0336 0.0144 0.0096 0.0336 ...
      0.0224 0.0216 0.0144 0.0504 0.0336];
disp('Minterm probabilities are in pm.  Use mintable(5)')

17.8.3 npr02_06 

% file npr02_06.m
% Data for problem P2-6
minvec3
DV = [A|Ac; A|(Bc&C); A&C; Ac&B; Ac&Cc; B&Cc];
DP = [1     0.65     0.20  0.25  0.25   0.30];
TV = [((A&Cc)|(Ac&C))&Bc; ((A&Bc)|Ac)&Cc; Ac&(B|Cc)];
disp('Call for mincalc')

8 This content is available online at <http://cnx.org/content/m24179/1.3/>. 
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17.8.4 npr02_07 

% file npr02_07.m
% Data for problem P2-7
minvec3
DV = [A|Ac; ((A&Bc)|(Ac&B))&C; A&B; Ac&Cc; A; C; A&Bc&Cc];
DP = [ 1         0.4           0.2   0.3  0.6 0.5   0.1];
TV = [(Ac&Cc)|(A&C); ((A&Bc)|Ac)&Cc; Ac&(B|Cc)];
disp('Call for mincalc')

17.8.5 npr02_08 

% file npr02_08.m
% Data for problem P2-8
minvec3
DV = [A|Ac; A; C; A&C; Ac&B; Ac&Bc&Cc];
DP = [ 1   0.6 0.4 0.3  0.2     0.1];
TV = [(A|B)&Cc; (A&Cc)|(Ac&C); (A&Cc)|(Ac&B)];
disp('Call for mincalc')

17.8.6 npr02_09 

% file npr02_09.m
% Data for problem P2-9
minvec3
DV = [A|Ac; A; A&B; A&C; A&B&Cc];
DP = [ 1   0.5 0.3  0.3   0.1];
TV = [A&(~(B&Cc)); (A&B)|(A&C)|(B&C)];
disp('Call for mincalc')
% Modification for part 2
% DV = [DV; Ac&Bc&Cc; Ac&B&C];
% DP = [DP 0.1 0.05];

17.8.7 npr02_10 

% file npr02_10.m
% Data for problem P2-10
minvec4
DV = [A|Ac; A; Ac&Bc; A&Cc; A&C&Dc];
DP = [1    0.6  0.2   0.4    0.1];
TV = [(Ac&B)|(A&(Cc|D))];
disp('Call for mincalc')

17.8.8 npr02_11 

% file npr02_11.m
% Data for problem P2-11
% A = male;  B = on campus;  C = active in sports
minvec3
DV = [A|Ac; A; B; A|C; B&Cc; A&B&C; A&Bc; A&Cc];
DP = [ 1  0.52 0.85 0.78 0.30 0.32  0.08 0.17];
TV = [A&B; A&B&Cc; Ac&C];
disp('Call for mincalc')

17.8.9 npr02_12 

% file npr02_12.m
% Data for problem P2-12
% A = male;  B = party member;  C = voted last election
minvec3
DV = [A|Ac; A; A&Bc; B; Bc&C; Ac&Bc&C];
DP = [ 1  0.60 0.30 0.50 0.20  0.10];
TV = [Bc&Cc];
disp('Call for mincalc')

17.8.10 npr02_13 

% file npr02_13.m
% Data for problem P2-13
% A = rain in Austin;  B = rain in Houston;
% C = rain in San Antonio
minvec3
DV = [A|Ac; A&B; A&Bc; A&C; (A&Bc)|(Ac&B); B&C; Bc&C; Ac&Bc&Cc];
DP = [ 1   0.35 0.15  0.20      0.45       0.30 0.05   0.15];
TV = [A&B&C; (A&B&Cc)|(A&Bc&C)|(Ac&B&C); (A&Bc&Cc)|(Ac&B&Cc)|(Ac&Bc&C)];
disp('Call for mincalc')

17.8.11 npr02_14 

% file npr02_14.m
% Data for problem P2-14
% A = male;  B = engineering;
% C = foreign language;  D = graduate study
minvec4
DV = [A|Ac; A; B; Ac&B; C; Ac&C; A&D; Ac&D; A&B&D; ...
      Ac&B&D; B&C&D; Bc&Cc&D; Ac&Bc&C&D];
DP = [1 0.55 0.23 0.10 0.75 0.45 0.26 0.19 0.13 0.08 0.20 0.05 0.11];
TV = [C&D; Ac&Dc; A&((C&Dc)|(Cc&D))];
disp('Call for mincalc')
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17.8.12 npr02_15 

% file npr02_15.m
% Data for problem P2-15
% A = men;  B = on campus;  C = readers;  D = active
minvec4
DV = [A|Ac; A; B; Ac&B; C; Ac&C; D; B&D; C&D; ...
      Ac&B&D; Ac&Bc&D; Ac&B&C&D; Ac&Bc&C&D; A&Bc&Cc&D];
DP = [1 0.6 0.55 0.25 0.40 0.25 0.70 0.50 0.35 0.25 0.05 0.10 0.05 0.05];
TV = [A&D&(Cc|Bc); A&Dc&Cc];
disp('Call for mincalc')

17.8.13 npr02_16 

% file npr02_16.m
% Data for problem P2-16
minvec3
DV = [A|Ac; A; B; C; (A&B)|(A&C)|(B&C); A&B&C; A&C; (A&B)-2*(B&C)];
DP = [ 1  0.221 0.209 0.112 0.197 0.045 0.062 0];
TV = [A|B|C; (A&Bc&Cc)|(Ac&B&Cc)|(Ac&Bc&C)];
disp('Call for mincalc')

17.8.14 npr02_17 

% file npr02_17.m
% Data for problem P2-17
% A = alignment;  B = brake work;  C = headlight
minvec3
DV = [A|Ac; A&B&C; (A&B)|(A&C)|(B&C); B&C;  A];
DP = [ 1    0.100        0.325       0.125 0.550];
TV = [A&Bc&Cc; Ac&(~(B&C))];
disp('Call for mincalc')

17.8.15 npr02_18 

% file npr02_18.m
% Data for problem P2-18
minvec3
DV = [A|Ac; A&(B|C); Ac; Ac&Bc&Cc];
DP = [ 1      0.3    0.6    0.1];
TV = [B|C; (((A&B)|(Ac&Bc))&Cc)|(A&C); Ac&(B|Cc)];
disp('Call for mincalc')
% Modification
% DV = [DV; Ac&B&C; Ac&B];
% DP = [DP   0.2     0.3];
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17.8.16 npr02_19 

% file npr02_19.m
% Data for problem P2-19
% A = computer;  B = monitor;  C = printer
minvec3
DV = [A|Ac; A&B; A&B&Cc; A&C; B&C; (A&Cc)|(Ac&C); ...
      (A&Bc)|(Ac&B); (B&Cc)|(Bc&C)];
DP = [1 0.49 0.17 0.45 0.39 0.50 0.43 0.43];
TV = [A; B; C; (A&B&Cc)|(A&Bc&C)|(Ac&B&C); (A&B)|(A&C)|(B&C); A&B&C];
disp('Call for mincalc')

17.8.17 npr02_20 

% file npr02_20.m
% Data for problem P2-20
minvec3
DV = [A|Ac; A; B; A&B&C; A&C; (A&B)|(A&C)|(B&C); B&C - 2*(A&C)];
DP = [ 1  0.232 0.228 0.045 0.062 0.197 0];
TV = [A|B|C; Ac&Bc&C];
disp('Call for mincalc')
% Modification
% DV = [DV; C];
% DP = [DP 0.230];

17.8.18 npr02_21 

% file npr02_21.m
% Data for problem P2-21
minvec3
DV = [A|Ac; A; A&B; A&B&C; C; Ac&Cc];
DP = [ 1   0.4 0.3   0.25 0.65  0.3];
TV = [(A&Cc)|(Ac&C); Ac&Bc; A|B; A&Bc];
disp('Call for mincalc')
% Modification
% DV = [DV; Ac&B&Cc; Ac&Bc];
% DP = [DP   0.1       0.3];

17.8.19 npr02_22 

% file npr02_22.m
% Data for problem P2-22
minvec3
DV = [A|Ac; A; A&B; A&B&C; C; Ac&Cc];
DP = [ 1   0.4 0.5   0.25 0.65  0.3];
TV = [(A&Cc)|(Ac&C); Ac&Bc; A|B; A&Bc];
disp('Call for mincalc')
% Modification
% DV = [DV; Ac&B&Cc; Ac&Bc];
% DP = [DP   0.1       0.3];
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17.8.20 npr02_23 

% file npr02_23.m
% Data for problem P2-23
minvec3
DV = [A|Ac; A; A&C; A&B&C; C; Ac&Cc];
DP = [ 1   0.4 0.3   0.25 0.65  0.3];
TV = [(A&Cc)|(Ac&C); Ac&Bc; A|B; A&Bc];
disp('Call for mincalc')
% Modification
% DV = [DV; Ac&B&Cc; Ac&Bc];
% DP = [DP   0.1       0.3];

17.8.21 npr03_01 

% file npr03_01.m
% Data for problem P3-1
minvec3
DV = [A|Ac; A; A&B; B&C; Ac|(B&C); Ac&B&Cc];
DP = [ 1  0.55 0.30 0.20   0.55      0.15];
TV = [Ac&B; B];
disp('Call for mincalc')

17.8.22 npr04_04 

% file npr04_04.m
% Data for problem P4-4
pm = [0.032 0.016 0.376 0.011 0.364 0.073 0.077 0.051];
disp('Minterm probabilities for P4-4 are in pm')

17.8.23 npr04_05 

% file npr04_05.m
% Data for problem P4-5
pm = [0.084 0.196 0.036 0.084 0.085 0.196 0.035 0.084 ...
      0.021 0.049 0.009 0.021 0.020 0.049 0.010 0.021];
disp('Minterm probabilities for P4-5 are in pm')

17.8.24 npr04_06 

% file npr04_06.m
% Data for problem P4-6
pm = [0.085 0.195 0.035 0.085 0.080 0.200 0.035 0.085 ...
      0.020 0.050 0.010 0.020 0.020 0.050 0.015 0.015];
disp('Minterm probabilities for P4-6 are in pm')



17.8.25 mpr05_16 

% file mpr05_16.m
% Data for Problem P5-16
A = [51 26  7; 42 32 10; 19 54 11; 24 53  7; 27 52  5;
     49 19 16; 16 59  9; 47 32  5; 55 17 12; 24 53  7];
B = [27 34  5; 19 43  4; 39 22  5; 38 19  9; 28 33  5;
     19 41  6; 37 21  8; 19 42  5; 27 33  6; 39 21  6];
disp('Call for oddsdf')

17.8.26 npr05_17 

% file npr05_17.m
% Data for problem P5-17
PG1 = 84/150;
PG2 = 66/125;
A = [0.61 0.31 0.08
     0.50 0.38 0.12
     0.23 0.64 0.13
     0.29 0.63 0.08
     0.32 0.62 0.06
     0.58 0.23 0.19
     0.19 0.70 0.11
     0.56 0.38 0.06
     0.65 0.20 0.15
     0.29 0.63 0.08];
B = [0.41 0.51 0.08
     0.29 0.65 0.06
     0.59 0.33 0.08
     0.57 0.29 0.14
     0.42 0.50 0.08
     0.29 0.62 0.09
     0.56 0.32 0.12
     0.29 0.64 0.08
     0.41 0.50 0.09
     0.59 0.32 0.09];
disp('Call for oddsdp')

17.8.27 npr06_10 

% file npr06_10.m
% Data for problem P6-10
pm = [0.072 0.048 0.018 0.012 0.168 0.112 0.042 0.028 ...
      0.062 0.048 0.028 0.010 0.170 0.110 0.040 0.032];
c  = [-5.3 -2.5 2.3 4.2 -3.7];
disp('Minterm probabilities are in pm, coefficients in c')
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17.8.28 npr06_12 

% file npr06_12.m
% Data for problem P6-12
pm = 0.001*[5 7 6 8 9 14 22 33 21 32 50 75 86 129 201 302];
c = [1 1 1 1 0];
disp('Minterm probabilities in pm, coefficients in c')

17.8.29 npr06_18.m 

% file npr06_18.m
% Data for problem P6-18
cx = [5 17 21 8 15 0];
cy = [8 15 12 18 15 12 0];
pmx = minprob(0.01*[37 22 38 81 63]);
pmy = minprob(0.01*[77 52 23 41 83 58]);
disp('Data in cx, cy, pmx, pmy')

17.8.30 npr07_01 

% file npr07_01.m
% Data for problem P7-1
T = [1 3 2 3 4 2 1 3 5 2];
pc = 0.01*[ 8 13 6 9 14 11 12 7 11 9];
disp('Data are in T and pc')

17.8.31 npr07_02 

% file npr07_02.m
% Data for problem P7-2
T = [3.5 5.0 3.5 7.5 5.0 5.0 3.5 7.5];
pc = 0.01*[10 15 15 20 10 5 10 15];
disp('Data are in T, pc')

17.8.32 npr08_01 

% file npr08_01.m
% Solution for problem P8-1
X = 0:2;
Y = 0:2;
Pn = [132  24   0; 864 144  6; 1260 216 6];
P = Pn/(52*51);
disp('Data in Pn, P, X, Y')
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17.8.33 npr08_02 

% file npr08_02.m
% Solution for problem P8-2
X = 0:2;
Y = 0:2;
Pn = [6 0 0; 18 12 0; 6 12 2];
P = Pn/56;
disp('Data are in X, Y, Pn, P')

17.8.34 npr08_03 

% file npr08_03.m
% Solution for problem P8-3
X = 1:6;
Y = 0:6;
P0 = zeros(6,7);        % Initialize
for i = 1:6             % Calculate rows of Y probabilities
  P0(i,1:i+1) = (1/6)*ibinom(i,1/2,0:i);
end
P = rot90(P0);          % Rotate to orient as on the plane
PY = fliplr(sum(P'));   % Reverse to put in normal order
disp('Answers are in X, Y, P, PY')

17.8.35 npr08_04 

% file npr08_04.m
% Solution for problem P8-4
X = 2:12;
Y = 0:12;
PX = (1/36)*[1 2 3 4 5 6 5 4 3 2 1];
P0 = zeros(11,13);
for i = 1:11
  P0(i,1:i+2) = PX(i)*ibinom(i+1,1/2,0:i+1);
end
P = rot90(P0);
PY = fliplr(sum(P'));
disp('Answers are in X, Y, PY, P')

17.8.36 npr08_05 

% file npr08_05.m
% Data and basic calculations for P8-5
PX = (1/36)*[1 2 3 4 5 6 5 4 3 2 1];
X = 2:12;
Y = 0:12;
P0 = zeros(11,13);
for i = 1:11
  P0(i,1:i+2) = PX(i)*ibinom(i+1,1/6,0:i+1);
end
P = rot90(P0);
PY = fliplr(sum(P'));
disp('Answers are in X, Y, P, PY')

17.8.37 npr08_06 

% file Newprobs/pr08_06.m
% Data for problem P8-6 (from Exam 2, 95f)
P = [0.0483 0.0357 0.0420 0.0399 0.0441
     0.0437 0.0323 0.0380 0.0361 0.0399
     0.0713 0.0527 0.0620 0.0609 0.0551
     0.0667 0.0493 0.0580 0.0651 0.0589];
X = [-2.3 -0.7 1.1 3.9 5.1];
Y = [ 1.3  2.5 4.1 5.3];
disp('Data are in X, Y, P')

17.8.38 npr08_07 

% file pr08_07.m  (from Exam3, 96s)
% Data for problem P8-7
X = [-3.1 -0.5  1.2  2.4  3.7 4.9];
Y = [-3.8 -2.0  4.1  7.5];
P = [0.0090 0.0396 0.0594 0.0216 0.0440 0.0203
     0.0495 0      0.1089 0.0528 0.0363 0.0231
     0.0405 0.1320 0.0891 0.0324 0.0297 0.0189
     0.0510 0.0484 0.0726 0.0132 0      0.0077];
disp('Data are in X, Y, P')



17.8.39 npr08_08 

% file Newprobs/pr08_08.m (from Exam 4 96s)
% Data for problem P8-8
P = [0.0156 0.0191 0.0081 0.0035 0.0091 0.0070 0.0098 0.0056 0.0091 0.0049
     0.0064 0.0204 0.0108 0.0040 0.0054 0.0080 0.0112 0.0064 0.0104 0.0056
     0.0196 0.0256 0.0126 0.0060 0.0156 0.0120 0.0168 0.0096 0.0056 0.0084
     0.0112 0.0182 0.0108 0.0070 0.0182 0.0140 0.0196 0.0012 0.0182 0.0038
     0.0060 0.0260 0.0162 0.0050 0.0160 0.0200 0.0280 0.0060 0.0160 0.0040
     0.0096 0.0056 0.0072 0.0060 0.0256 0.0120 0.0268 0.0096 0.0256 0.0084
     0.0044 0.0134 0.0180 0.0140 0.0234 0.0180 0.0252 0.0244 0.0234 0.0126
     0.0072 0.0017 0.0063 0.0045 0.0167 0.0090 0.0026 0.0172 0.0217 0.0223];
X = 1:2:19;
Y = [-5 -3 -1 3 5 9 10 12];
disp('Data are in X, Y, P')



17.8.40 npr08_09 

% file pr08_09.m  (from Exam3 95f)
% Data for problem P8-9
P = [0.0390 0.0110 0.0050 0.0010 0.0010
     0.0650 0.0700 0.0500 0.0150 0.0100
     0.0310 0.0610 0.1370 0.0510 0.0330
     0.0120 0.0490 0.1630 0.0580 0.0390
     0.0030 0.0090 0.0450 0.0250 0.0170];
X = [1 1.5 2 2.5 3];
Y = [1 2 3 4 5];
disp('Data are in X, Y, P')

17.8.41 npr09_02 

% file Newprobs/npr09_02.m
% Data for problem P9-2
P = [0.0589 0.0342 0.0304 0.0456 0.0209;
     0.0961 0.0556 0.0498 0.0744 0.0341;
     0.0682 0.0398 0.0350 0.0528 0.0242;
     0.0868 0.0504 0.0448 0.0672 0.0308];
X = [-3.9 -1.7 1.5 2.8 4.1];
Y = [-2 1 2.6 5.1];
disp('Data are in X, Y, P')

17.8.42 npr10_16 

% file npr10_16.m
% Data for problem P10-16
cx = [-2 1 3 0];
pmx = 0.001*[255 25 375 45 108 12 162 18];
cy = [1 3 1 -3];
pmy = minprob(0.01*[32 56 40]);
Z = [-1.3 1.2 2.7 3.4 5.8];
PZ = 0.01*[12 24 43 13 8];
disp('Data are in cx, pmx, cy, pmy, Z, PZ')

17.8.43 npr12_10 

% file npr12_10.m
% Data for problems P12-10, P12_11
cx = [-3.3 -1.7 2.3 7.6 -3.4];
pmx = 0.0001*[475 725 120 180 1125 1675 ...
              280 420 480 720 130 170 1120 1680 270 430];
cy = [10 17 20 -10];
pmy = 0.01*[6 14 9 21 6 14 9 21];
disp('Data are in cx, cy, pmx and pmy')

% file npr16_07.m
% Transition matrix for problem P16-7
P = [0.23 0.32 0.02 0.22 0.21
     0.29 0.41 0.10 0.08 0.12
     0.22 0.07 0.31 0.14 0.26
     0.32 0.15 0.05 0.33 0.15
     0.08 0.23 0.31 0.09 0.29];
disp('Transition matrix is P')

% file npr16_09.m
% Transition matrix for problem P16-9
P = [0.2 0.5 0.3  0   0   0   0
     0.6 0.1 0.3  0   0   0   0
     0.2 0.7 0.1  0   0   0   0
      0   0   0  0.6 0.4  0   0
      0   0   0  0.5 0.5  0   0
     0.1 0.3  0  0.2 0.1 0.1 0.2
     0.1 0.2 0.1 0.2 0.2 0.2  0 ];
disp('Transition matrix is P')
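
A transition matrix such as the one for problem P16-7 can be checked and explored with a few lines of base MATLAB. The sketch below verifies that each row sums to one and solves the stationary equations for the long-run distribution. Since every entry of this matrix is positive, the chain is irreducible and aperiodic, so the long-run distribution is unique. The variable names rowsums, Aeq, beq, and pi0 are assumptions introduced here and do not refer to the text's m-procedures.

```matlab
% Sketch: row-sum check and long-run distribution for the P16-7 transition matrix.
P = [0.23 0.32 0.02 0.22 0.21
     0.29 0.41 0.10 0.08 0.12
     0.22 0.07 0.31 0.14 0.26
     0.32 0.15 0.05 0.33 0.15
     0.08 0.23 0.31 0.09 0.29];
rowsums = sum(P, 2);                    % each row of a transition matrix sums to 1
n = size(P, 1);
% Stationary equations: pi0*P = pi0 together with sum(pi0) = 1
Aeq = [P' - eye(n); ones(1, n)];
beq = [zeros(n, 1); 1];
pi0 = (Aeq \ beq)';                     % least-squares solve of the consistent system
disp('Long-run distribution:')
disp(pi0)
```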
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Applied Probability 

The present collection uses a number of user-defined m-programs, in combination with built-in MATLAB 
functions, to solve a variety of probabilistic problems. These m-files are included as text files in the 
collection New Prob m-files. We use the term m-function to designate a user-defined function, as distinct 
from the basic MATLAB functions that are part of the MATLAB package. An m-procedure (or sometimes 
a procedure) is an m-file containing a set of MATLAB commands that carry out a prescribed set of 
operations. Generally, these prompt for (or assume) certain data upon which the procedure is carried 
out. We use the term m-program (or often m-file) to refer to either an m-function or an m-procedure. 
Although most of the m-programs were written for MATLAB version 4.2, they work for versions 5.1, 5.2, 
and 7.04. The later versions offer some new features that may allow more efficient implementation of some 
of the m-programs, and that make possible some new ones. With one exception (so noted), these are not 
exploited in this collection, because of the pedagogical value of dealing with explicitly developed procedures 
whose dependence on basic MATLAB is displayed. These programs, with perhaps some exceptions, also run 
on the MATLAB alternatives SCILAB and OCTAVE. Users of these programs should be able to make 
appropriate adjustments if needed. In addition to the m-programs, there is a collection of m-files for specific 
problems, with properly formatted data that can be entered into the workspace by calling the file. These 
m-files come from a variety of sources (e.g., exams or problem sets, hence the odd names) and may be useful 
for examples and exercises. This collection is in the text file New Prob mfiles. 
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